Editorial

Peggy Johnson 




Writing a timely editorial or one that will seem pertinent
(or, with luck, entertaining) four months after I put fin- 
gers to the keyboard is always a challenge. We have snow on 
the ground in Minnesota and spring will have arrived by the 
time this issue reaches your mailboxes. The ALA Midwinter
Meeting will be over and people will be getting ready for 
Annual Conference in Washington, D.C. 

That observation gives me the starting point for this 
editorial. The Association for Library Collections & Technical Services (ALCTS) 
will be a major player at the 2007 Annual Conference as we celebrate fifty 
years as an association serving the profession. The week of programming and 
events begins with a one-and-a-half-day conference (June 20-21), "Interactive 
Futures: A National Conference on the Transformation of Library Collections 
& Technical Services." Featured speakers are Richard A. Lanham and Stephen 
Abram. Lanham is an author, lecturer, and UCLA English professor emeritus,
whose works include The Electronic Word: Democracy, Technology, and the Arts
(Univ. of Chicago Pr., 1993) and The Economics of Attention (Univ. of Chicago 
Pr., 2006). Abram is vice president of innovation, Sirsi Corporation, and a lead- 
ing international librarian and provocative thinker in the North American library 
community. The conference features plenary and breakout sessions. It concludes 
with the ALCTS 50th anniversary gala dinner cruise on the Potomac River. This 
is a conference not to be missed! 

ALCTS also will be offering five preconferences: two are two-day
events, one is a single day, and two are half days. They are:

• Comprehensive Series Training 

• Basic Library of Congress Classification 

• What They Don't Teach in Library School: Competencies, Education, and 
Employer Expectations for a Career in Cataloging 

• Managing the Multigenerational Workplace: Practical Techniques 

• Workflow Analysis, Redesign, and Implementation: Integrating Electronic 
Resources 

With such a variety of topics and options for length of time involved, attend- 
ees are sure to find something that meets their needs and interests. 

In addition, ALCTS is sponsoring fourteen programs during the conference. 
All look splendid. The ALCTS President's Program, "Libraries and Findability: 
Elegant Hacks for Our Future," deserves special mention. Peter Morville is the
keynote speaker and is described as a passionate advocate of the role that
"findability" plays in defining the user experience. Morville is the author of Ambient
Findability (O'Reilly, 2005) and coauthor of Information Architecture for the 
World Wide Web: Designing Large-Scale Web Sites, 2nd ed. (O'Reilly, 2002). He 
is the president and founder of Semantic Studios, an information architecture, 
user experience, and findability consultancy. He is a graduate of the University of
Michigan's School of Information, where he is an adjunct faculty member. The 
President's Program will be Monday, June 25, 10:30 a.m., and is the final event 
in the ALCTS 50th Anniversary Celebration. 
I encourage you to join the celebration. Detailed
information is available at the ALCTS anniversary Web site 
(www.ala.org/alcts50). Be sure to visit the section "Looking 
Back," which has a list of past presidents, photos, trivia, 
and more. 

As I look toward the fiftieth anniversary festivities, I also 
ponder the future of ALCTS and LRTS. Both depend on 
you — being engaged in ALCTS activities, volunteering for 
service, and writing for publication. To that end, I am sug- 
gesting topics that would make excellent themes for papers 
to be submitted to LRTS. These topics could be the start- 
ing point for research projects or a catalyst for essays that 
thoughtfully consider one or more perspectives on a par- 
ticular topic. LRTS also publishes papers in a section called 
"Notes on Operations," which report practical applications 
and problem solutions that have implications beyond the
library in which they occur. 

We have had few papers in the area of acquisitions 
recently, yet I think this is one of the exciting, fast-mov- 
ing areas in technical services. What new services are 
foreign vendors offering and how do they compare with 
those provided by U.S. vendors? How do these changing 
services from both foreign and domestic vendors affect 
workflow, allocation of staff time, and level of staff (type or 
classification) doing the work? Is anyone thinking about
standard elements for an acquisitions record? What about a
historical review of the changes undergone and undertaken 
by monograph vendors to meet libraries' changing expectations?
Related to this topic: are libraries changing practices
to mesh with changing services, or are libraries driving
the changes in services that vendors supply? Does anyone 
have experience with the new WorldCat Selection service, 
based on the Integrated Tool for Selection and Ordering at 
Cornell University Library (ITSO CUL)? Perhaps some- 
one in a library implementing this service could collect 
data pre- and post-implementation and prepare an analysis 
for LRTS. 

The future of the catalog is the topic du jour and probably
de dix ans. Next-generation catalogs and various initiatives
underway (such as PennTags, WorldCat Identities, and
University of Illinois at Urbana-Champaign Libraries' Buy a
Book service) to enhance current catalogs and services are
fascinating. What do librarians at Endeca-using institutions 
have to tell us? Those who are looking to implement Ex
Libris Primo should be thinking now about how they can
share their experiences and the results with colleagues. 

Preservation of digital content and traditional formats 
remains a critical topic. I would welcome exploration of 
solutions to digital archiving — helping readers understand 
LOCKSS, PORTICO, dark archives, semi-dark archives,
light archives, perpetual archives, and where librarians and 
libraries have responsibilities. Do microforms still have 
a role in library collections? Are they being replaced by
digitized content? If so, what are the responses from and
consequences to users? 

The landscape for licensing digital content and access 
to digital content is changing rapidly. Perhaps now is the 
time for a paper that considers the Google project, the Open 
Content Alliance, and in-house projects — and how they are, 
together, building a new universe for information seekers. 
How do these new types of collections fit with traditional 
collection, use, and user assessment? Are libraries employ- 
ing statistical measures of usage for Web-based information 
resources? Have the guidelines for these measures pro- 
moted by the International Coalition of Library Consortia 
affected the practices of content providers? I've heard 
talk about a universal license. What has been the result of 
national e-content licenses in other countries? Licensors 
and licensees both have perspectives that can be explored 
and shared. 

I am especially interested in trends in the organization 
of technical services, the changing skill sets expected of 
professional librarians, and the expanding role of non-MLIS 
professionals. What defines original cataloging or, more to 
the point, what is the role of MLIS professionals in techni- 
cal services? 

Anyone who is considering writing a paper for LRTS 
should review the "Instructions for Authors" and "Author 
FAQ" sections on the LRTS Web site (www.ala.org/alcts/lrts).
The LRTS Editorial Board provides mentors to potential
authors who are interested in this service. I conclude
with my now familiar advice for aspiring authors. 

• Do not write a simple how-we-did-it-good paper.
Successful projects can be the basis of good papers, 
but they need to be placed in a larger context. Why 
should readers care? Have others tackled the problems
or written about them? What can readers learn from
the project being reported? 

• Be attentive to grammar and spelling. Proofread and 
proofread again. 

• Check citations for accuracy. 

• Do not overshadow prose with illustrative matter. 
Most papers need no more than six to eight (at most)
figures and tables. The data represented in illustrative
matter and their significance should be explained
in the prose. Illustrative matter is not required; some 
papers do not need tables reporting quantitative and 
statistical findings or figures demonstrating topics 
addressed. 

• Be sure that your paper fits within the scope of 
LRTS. 

• Browse through recent issues of LRTS to get a sense 
of style, length, and tone. Read the papers that
received the Best of LRTS award. I commend to
you: 
Jennifer Bowen, "FRBR: Coming Soon to Your
Library," 49, no. 3 (July 2005): 175-188.

Kristin Antelman, "Identifying the Serial Work
as a Bibliographic Entity," 48, no. 4 (Oct. 2004):
238-55.

Amy Weiss, "Proliferating Guidelines: A History
and Analysis of the Cataloging of Electronic
Resources," 47, no. 4 (Oct. 2003): 171-187.



• LRTS is a scholarly journal. Your paper should reflect 
this while being readable. Ponderous prose is deadly. 

Finally, do not hesitate to contact a member of the 
Editorial Board or me if you wish to discuss a potential 
paper. 



ALCTS, LRTS 50th Anniversaries 

The stellar lineup of events highlighting the Association 
for Library Collections & Technical Services (ALCTS) 
50th anniversary celebration, "Commemorating our 
Past, Celebrating our Present, Creating our Future," 
will begin at Midwinter Meeting's Anniversary Year 
Kickoff Reception and will include the ALCTS National 
Conference, a gala dinner cruise, and the annual 
President's Program featuring Peter Morville, author 
of Ambient Findability and president of Semantic 
Studios. 

ALCTS members and Library Resources &
Technical Services (LRTS) subscribers will also enjoy
anniversary articles, a complimentary copy of the
fifty-year cumulative index to LRTS, and the reissue
of LRTS volume 1, number 1. A 50th anniversary
commemorative publication will be available for
purchase through the ALA Online Store in late 2007.
Commemorating.

Celebrating. 

Creating. 



Web Site Explains, Entertains 

The ALCTS anniversary Web site (www.ala.org/alcts50)
provides information on all anniversary events, links to
registration forms and information, and photos,
trivia, and other surprises.



Contribute Your Thoughts, Photos 

A special 50th anniversary survey offers an informal
(and, we hope, fun) exercise for those who have chosen
library careers to reflect on their time in the profession
and size up their expectations for the future. Take the
survey now by visiting the 50th anniversary Web site
(www.ala.org/alcts50) and clicking on the "Survey" link.

Everyone is invited to contribute photos (candid or 
professional) to the ALCTS photo gallery on the Web 
site. Photos can range from shots of ALCTS events to 
family photos from trips to Seattle and Washington, 
D.C. Submission instructions are on the 50th anniver- 
sary Web site. 
Quo Vadis, Preservation
Education? 

A Study of Current Trends and 
Future Needs in Continuing 
Education Programs 

By Karen F. Gracy and Jean Ann Croft 

This research study assesses preservation education offered by continuing edu- 
cation (CE) providers in the United States. Educators teaching preservation 
workshops for regional field service organizations and other local and regional 
preservation networks were surveyed about the type and number of workshops 
offered, content of preservation offerings, audience, faculty resources, future plans 
for curricula, and availability of continuing education credits. The investigators 
hypothesize that preservation workshops offered by CE providers serve multiple 
purposes for the library and archival science professions, becoming not only an 
avenue for professionals to continue to develop or reinforce their knowledge and
skills in preservation, but also often the primary source of rudimentary preservation
education for library and information science professionals and paraprofessionals.
This paper reviews the literature relevant to the study of preservation in
the CE environment, describes the research methodology employed in designing 
and conducting the survey, presents the resulting data, and analyzes the trends 
revealed by the data in order to understand more fully the goals and objectives of 
CE in preservation during the last decade and to gauge future directions of the 
field. This paper concludes by presenting plans for further research, which will 
expand upon initial findings of this survey. 



Karen F. Gracy (kgracy@pitt.edu) is 
Assistant Professor, School of Information 
Sciences, and Jean Ann Croft (jeanann@pitt.edu)
is Preservation Librarian,
University Library Services, University of 
Pittsburgh. 

Submitted February 27, 2006; accepted 
for publication April 17, 2006, pend- 
ing revision; revision submitted May 24, 
2006, and accepted for publication. 



The Need for Continuing Education 
in the Field of Preservation 

As part of an overall desire to promote continuing professional development 
and to foster lifelong learning, continuing education (CE) provides an essen- 
tial service to library and information science (LIS) practitioners. It gives librar- 
ians, archivists, and other cultural heritage professionals essential information, 
skills, and insight throughout their careers. Both the American Library Association
(ALA) and the Society of American Archivists (SAA) affirm the value of CE in 
promoting lifelong learning for practitioners. 1 

Continuing education plays a particularly important role in sustaining the 
preservation imperative, as it often serves as the first or only source of information
for professionals and support staff on how to protect and extend the life of
library and archival materials. The 2005 Heritage Health Index, which aimed to 
"assess the condition and preservation needs of U.S. Collections," indicates the 
fundamental need for preservation education: of the more
than 30,000 American cultural institutions, responsible for 
more than 4.8 billion artifacts, 70 percent of collecting insti- 
tutions indicate a need to provide additional training and 
expertise for staff caring for their collections. 2 The LIS field 
must focus on providing practitioners with ample opportu- 
nities to increase their knowledge of preservation concepts 
and help them master key preservation skills, through both 
graduate and continuing education. 

Given the challenges to be faced in educating the next 
generation of LIS professionals to care for cultural heritage 
materials, the authors of this paper felt that the time was 
ripe to conduct a formal study of the state of continuing 
education. Thus, this research aims to thoroughly document 
activities in the field of continuing education for preservation
during the last decade, and offer suggestions for how
CE providers can best place themselves to provide the 
needed knowledge and expertise to effectively administer 
preservation programs in libraries and archives. 

History of Preservation Continuing Education 
and Its Impact on the Preservation Field 

Education in preservation has a relatively brief history com- 
pared with that of other specializations within LIS. In the 
1970s, few graduate library science programs offered con- 
servation or preservation as a regular part of their curricu- 
lum. Continuing education offerings — primarily in the form 
of workshops and short courses — constituted the primary 
source of preservation education for most practitioners. 
Many current graduate school offerings in preservation 
can trace their roots to these pilot programs, as they were 
often first offered through university CE programs. 3 In the 
last three decades, many leading preservation professionals 
(both educators and administrators) focused their efforts on 
integrating preservation into graduate library science educa- 
tion. 4 These labors have been fruitful, as more than three- 
quarters of all LIS schools with ALA accreditation now offer 
at least one course in the area of preservation. 5 Continuing 
education was seen as playing a complementary role, how- 
ever. Its role was not particularly well-defined beyond the 
general recommendation to acquaint practitioners with the 
"basic tenets of preservation," and to serve as a potential 
route to specialization within the preservation field. 6 

In its 1991 report, the Preservation Education Task 
Force, organized by the Commission on Preservation and 
Access, suggested that CE efforts should focus on develop- 
ing short-term, intensive training programs for mid-career 
librarians and archivists, similar to the in-house training 
program found at the library system of the University of 
California-Berkeley. 7 The reasoning behind this recommendation
was that such programs were necessary because preservation
was not yet a part of most LIS graduate programs'
curricula at that time.

In the late 1980s and 1990s, several programs were launched in
emulation of the short-term model, including the SAA 
Preservation Management Institute (1987) and its succes- 
sor, the Preservation Management Training Program (1992- 
1994); the Preservation Intensive Institute, first hosted by 
the University of Pittsburgh in 1993 and in 1994 at UCLA; 
and the Rutgers Preservation Management Institute (first 
held in 1998). As the names of these programs suggest, 
they emphasized the management aspects of preservation, 
rather than simply teaching basic skills such as book repair. 
They had significant impact on the LIS profession, as doz- 
ens of professionals graduating from these programs were 
able to integrate preservation administration principles into 
the management of their own institutions. 

Programs of this kind require a significant investment 
of time and resources, and rely heavily on subsidies from 
federal and regional funding agencies. Without such fund- 
ing, sustaining programs is difficult, as most potential stu- 
dents cannot afford them (unless their employers provide 
subsidies). For example, tuition for the most recent offering 
of the Rutgers Preservation Management Institute (PMI) 
in 2005 was $4,075, which covered the costs of fifteen days 
of instruction and the review of course assignments by 
instructors. This amount did not include costs for travel, 
accommodations, and meals. Scholarships from the National
Endowment for the Humanities and the New Jersey 
Historical Commission covered tuition and travel-associ- 
ated costs for a dozen students; each offering of the PMI is 
limited to twenty students. 

Of the three major initiatives, only the Rutgers program 
has survived over the long term and continues to educate 
administrators to manage preservation programs. While the 
aims of these programs were admirable, the difficulties in 
sustaining intensive programs of this type mean that most of 
them remained experiments rather than successful models 
that could be duplicated in multiple venues. 

Given the high costs of intensive training, another 
model for continuing preservation education also grew and 
expanded during this period: the regional workshop, as 
offered by field service programs, professional associations, 
and other local preservation-focused organizations. The tar- 
get audience for these briefer offerings (most often held as 
half-day or one-day programs) has been much broader than 
for intensive programs, as educators aim to serve the needs 
of professionals and paraprofessionals at all levels of exper- 
tise, not just mid-career professionals. Workshop providers 
focus on providing training in key areas such as disaster 
preparedness and recovery, management of environmental 
conditions, and book repair. While the management per- 
spective is still central to most of these workshops, the broad 
spectrum of the potential audience and the limited time 
available for instruction often lead to a focus on training
and skills rather than analysis and synthesis of preservation 
concepts. 

The work of Cloonan provides an interesting perspec- 
tive on approaches to preservation education. 9 Cloonan's 
research targeted respondents in various institutional envi- 
ronments as well as international settings. Utilizing inter- 
views and questionnaires, the author surveyed respondents 
and sought feedback concerning what they identified as 
issues and challenges in preservation education and sug- 
gested resolutions to the problems. In considering the 
differences in focus and objectives between graduate and 
continuing education in preservation, Cloonan made a dis- 
tinction among several related concepts: training, education, 
and continuing education: 

Training usually implies the learning of specific or 
specialized skills, often in a workshop setting; for 
example, disaster recovery, care of photographic 
prints, book repairs, or monitoring the library envi- 
ronment. Education is a more comprehensive term 
which refers not only to acquiring skills, but also to 
obtaining knowledge through experience, creativ- 
ity, analysis, and the exchange of ideas. Education 
is life-long while training takes place over a finite 
period of time. Continuing education can take 
place at any stage of one's career. It may consist 
of refresher courses, or may lead to certificates 
of advanced study. Library schools, libraries, and 
professional associations offer continuing educa- 
tion programs. 10 

Although these distinctions are helpful in theory, in 
practice the lines between training and continuing educa- 
tion are often blurred in preservation CE offerings. For the 
purposes of this study, the investigators chose to combine
the categories of training and continuing education
under the category of continuing education.

Furthermore, other organizations in addition to univer- 
sities, libraries, and associations have taken on responsibil- 
ity for CE as Cloonan defined it. Although a number of 
graduate education providers continue to offer CE courses 
to the LIS community, the fiscal realities of running a self- 
sufficient CE program (one that may have been heavily 
subsidized by the institution or external grants) have led 
many information schools to bow out as CE providers, par- 
ticularly in those areas where the audience may not be large, 
or where a region is already well-served by a field service 
provider. 11 This trend away from universities as preservation 
CE providers and toward other organizations also affected 
how the investigators chose to define the population for this 
study; see the Current Sources of Continuing Education for 
Preservation and Research Method sections that follow. 



Current Sources of Continuing Education 
for Preservation 

In the United States, many different organizations offer 
continuing education on preservation topics; sources include 
field service programs of regional conservation centers and 
library consortia, local preservation networks, universities, 
and professional associations. Although some of these edu- 
cation providers offer preservation workshops (particularly 
those dealing with popular topics such as book repair or 
disaster recovery) on a regular basis, others offer preserva- 
tion topics sporadically, as the need arises, or upon request. 

The organizations comprising the Regional Alliance for 
Preservation (RAP) have become among the most reliable 
sources for preservation education. RAP is a network of 
organizations devoted to preservation and conservation of 
cultural objects that provide assistance to library, archive, 
and museum professionals across the country. RAP orga- 
nizations focusing on preservation of library and archival 
materials include the Northeast Document Conservation 
Center (NEDCC), the Conservation Center for Art and 
Historic Artifacts (CCAHA), Amigos Library Services, and 
the Southeastern Library Network (SOLINET). All of them 
consider education to be part of their mission and have 
developed an ongoing curriculum in preservation. 

Other regional and local organizations, such as the 
California Preservation Clearinghouse, the Massachusetts 
Board of Library Commissioners, and the New York State 
Program on the Conservation/Preservation of Library 
Research Materials, also play an important role in providing 
preservation education to practicing professionals and para- 
professionals. These local organizations often work with RAP 
institutions to offer workshops, with the local preservation 
network providing the venue and the RAP member providing 
qualified instructors. Associations, while serving as a critical 
source of CE workshops, are not always consistent provid- 
ers of CE programs. Most association CE offerings are tied 
to annual conferences and must be proposed by members 
of the association each year; thus, one cannot count on the
same topics being offered regularly. The primary exception 
to this situation is SAA, which offers a full slate of regional 
workshops through its CE program in addition to its confer- 
ence offerings. 

New Directions for Continuing Education 

While core topics such as disaster response and recovery, 
management of environmental conditions, and book repair 
continue to be the mainstay of continuing education in pres- 
ervation, CE providers also strive to address digital pres- 
ervation issues. Thus far most CE programs have focused 
primarily on using digitization to reformat objects. The 
School for Scanning, a three-day symposium hosted by the 
NEDCC, was a pioneer in providing education in how to
manage digitization projects (its target audience is preserva- 
tion administrators). The ongoing preservation of digitized 
and "born-digital" materials has received far less attention 
to date, although that is slowly changing as the field begins 
to embrace digitization as a preservation reformatting meth- 
od. 12 The recent introduction of workshops that aim to give 
a general overview of the critical issues surrounding digital 
preservation indicates that the field is beginning to move
beyond the building of digital libraries, to the maintenance 
of these new resources over time. While the preservation 
community recognizes the need for educating librarians and 
archivists in how to preserve the massive quantities of digital 
materials in their care, the lack of concrete strategies and 
standards continues to frustrate both educators and potential
audiences for CE workshops in digital preservation. 

CE programs are also moving beyond the care of 
paper-based materials, to target visual materials, sound 
recordings, and moving images. According to a 2001 study 
of Association of Research Libraries (ARL) members, 
the holdings of ARL libraries include 1.3 million moving 
images, 5.3 million sound recordings, and more than 64 mil- 
lion graphic materials. 13 The Heritage Health Index, which 
includes many more institutions, indicates that cultural 
institutions hold 40.2 million moving images, 46.4 million 
sound recordings, and 724.4 million items in photographic 
collections. 14 Yet, many librarians and archivists with preser- 
vation responsibilities are not adequately prepared to care 
for these media. Most graduate courses and CE workshops 
that focus on the basics of preservation give scant attention 
to the care of media other than paper-based material or still 
photographs. Although some CE workshops specializing in 
these media exist, they are not offered with the same regu- 
larity as other courses, often being seen as special topics or 
part of an advanced curriculum rather than being included 
at the introductory level. 

Comparing the Roles of Graduate and Continuing 
Education in Preservation 

The division between preservation education in graduate 
programs and through CE is murky, as the curriculum 
of graduate and CE courses often overlaps significantly. 
One might trace the reasons for this overlap to two fac- 
tors: the relatively small number of professionals exposed 
to preservation in graduate school (less than 5 percent of 
all MLIS recipients include preservation as part of their 
coursework), leading to a large number of practitioners who 
must then pursue basic preservation education elsewhere; 
and the large number of paraprofessionals given preservation
responsibilities who do not have access to preservation
education through a formal degree program. 15

Because of this blurring of the line between graduate 
and continuing education for preservation, the authors of
this study hypothesize that the opportunities offered by CE
providers go beyond simply facilitating lifelong learning 
objectives. They aim to close the gap in the knowledge base 
of LIS practitioners that cannot be filled satisfactorily by 
formal educational programs or on-the-job training alone. 
While they aspire to serve multiple audiences and a variety 
of purposes for the library and archival science profes- 
sions, they now function as the de facto primary source 
of rudimentary preservation education for LIS profession- 
als and paraprofessionals. As a corollary hypothesis, this 
study suggests that current preservation education within 
traditional library and archival studies programs does not 
provide adequate preparation in the areas of technical and 
managerial expertise to deal with the preservation of digi- 
tal collections, audiovisual media, or visual materials. The 
investigators approached these problems as issues worthy 
of research, in order to document the current situation and 
place these issues on the national LIS educational agenda. 
Specifically, the investigators sought to address the following 
research questions: 

1. What is the composition of curricula for CE programs 
in preservation? How have those curricula changed over
the past decade? 

2. What is the relationship between graduate and con- 
tinuing education in preservation? 

3. How do educators plan to keep pace with new formats 
and technological advancements? 

4. Do preservation educators provide students with the 
opportunity to put theory into practice? If so, how is 
this achieved? 

5. What do preservation educators see as the key knowl- 
edge and values in preservation education? How are 
these values reflected in the curricula? 

The following report summarizes the results of the research 
undertaken to find answers to the previous questions. 

Research Method 

This survey aimed to document the extent and breadth of
continuing education offerings sponsored
by field service programs and other regional or local
networks. The survey also attempted to gauge the attitudes 
and views of preservation educators across the spectrum of 
preservation education in relation to topics such as growth 
of the field. 

Establishing a Working Population of Preservation 
Education Providers 

This assessment of preservation education was directed 
toward CE providers in the United States. The population
of CE providers proved to be an amorphous group;
thus, recipients of surveys were identified in several ways.
The investigators used a combination of sources, includ- 
ing a listing of members of RAP (which consists of field 
service providers), listings in the eighth edition of the ALA 
Preservation Education Directory (published in 2002), and 
recommendations from colleagues. 16 The research team 
also sent out a general call via several electronic discussion 
groups: the Preservation Administration Discussion Group, 
or PADG; jESSE (a list devoted to discussion of library 
and information science education issues); and Forum for 
Archival Educators (a private discussion list whose members 
are educators in archival studies programs). 17 Additionally,
a Web site was set up to allow individuals involved in 
preservation education to request a survey. Finally, an 
announcement was published in the October 2003 issue of The
Abbey Newsletter, a periodical devoted to current news and
developments in library and archival preservation. 19 

The main criterion for including an education provider 
in the study was evidence that the organization was commit- 
ted to offering preservation workshops with some regularity 
(i.e., at least once a year). An examination of the information 
provided in the ALA Preservation Education Directory and 
the organization's Web site (if one existed) served as the 
primary method that was used to make this determination. 
The investigators may have underestimated the size of the 
CE provider population, in that they may have failed to 
identify ad hoc or regional organizations; however, these 
methods provided a feasible sampling frame with which 
to proceed with the study. When multiple responses were 
received from the same institution, the researchers com- 
pared responses and selected the most reliable. 

To encourage participation, survey recipients were 
assured of the confidentiality of their responses. Because 
of this requirement, the investigators were sometimes 
required to aggregate data in order to maintain the confi- 
dentiality of participants despite the small size of the work- 
ing population. 

Description of the Survey Instrument 

The survey (see appendix) was sent to field service providers 
and other organizations identified as sources of continuing 
education. The research team targeted those individuals 
identified as being in charge of educational offerings. The 
investigators asked questions dealing with the following 
topics: 

• type and number of workshops offered; 

• frequency of workshop offerings; 

• enrollment statistics; 

• existence of credential in preservation and/or award 
of CE credits; 



• content of preservation workshops; 

• incorporation of preservation into related work- 
shops; 

• faculty resources; 

• future plans for curricula; and 

• audience for workshops. 

In total, 38 surveys were sent to potential participants; 
this list consisted of educators identified through the ini- 
tial compilation of the working population (as previously 
detailed). Although postings were made to various electron- 
ic discussion groups as previously detailed, the investigators 
received no additional requests for the survey from educa- 
tors who were not on this initial list. Recipients who did not 
respond to the call to participate were sent a reminder after 
six weeks; a second reminder was sent twelve weeks after 
the initial contact to those who still had not responded. After 
three attempts at contact, the research team considered the 
data collection period to be closed. 

To standardize coding and subsequent analysis of data, 
the survey used checkboxes wherever possible, and refrained 
from open-ended questions as much as possible. Where par- 
ticipants were asked to fill in answers (for example, "list each 
preservation workshop offered"), the investigators created 
nominal coding categories to aggregate data. 
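
The coding step described above can be illustrated with a minimal sketch. The survey's actual coding scheme is not reported, so the keyword rules and the sample answers below are hypothetical; the sketch only shows how free-text workshop titles might be mapped to nominal categories and then tallied as a frequency distribution.

```python
from collections import Counter

# Hypothetical keyword rules; the study's real coding scheme is not given.
CATEGORY_KEYWORDS = {
    "Disaster planning/emergency management": ("disaster", "emergency"),
    "Book repair": ("repair",),
    "Reformatting and digitization": ("digitiz", "reformat", "microfilm"),
}


def code_title(title: str) -> str:
    """Assign a free-text workshop title to the first matching category."""
    lowered = title.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "Other"


# Invented fill-in answers, for illustration only.
answers = [
    "Writing a Disaster Plan",
    "Basic Book Repair",
    "Emergency Response for Collections",
    "Digitization Project Planning",
    "Exhibit Lighting",
]

# Frequency distribution over the nominal categories.
frequencies = Counter(code_title(answer) for answer in answers)
for category, count in frequencies.most_common():
    print(f"{category}: {count}")
```

A fixed set of categories of this kind is what allows answers from different respondents to be aggregated, at the cost of some detail in the original wording.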

Potential Sources of Bias 

The investigators see several potential sources of bias in this 
research. First, the data may be slanted toward those indi- 
viduals who are predisposed to participate in surveys. Field 
service providers were more apt to respond, as education is 
often a central part of their organizational mission. Second, 
answers to certain questions about future plans in hiring 
and curriculum should be treated somewhat cautiously. 
Respondents who were not full-time employees of an orga- 
nization may not have had a complete understanding of the 
current situation regarding hiring or curriculum revision. 
Additionally, some organizations may be wary about reveal- 
ing plans in this area (despite assurances of anonymity) for 
fear of being seen as making a firm commitment to hiring of 
new instructors or offering new workshops. 

Another source of bias lies in the definition of the
working population for this study. Early in the research proj- 
ect, investigators made the decision to exclude professional 
associations as part of the population on the observation 
that many associations do not regularly offer preservation 
workshops as part of an established CE program (as most 
workshops are tied to conferences). SAA is the primary 
exception, as has been previously noted. In retrospect, the 
research team admits that the exclusion of association data 
may slightly skew the overall trends identified and conclu- 
sions reached in this study. 
The most significant potential bias of this research
concerns truthfulness in reporting data. For the questions 
that asked respondents to provide hard numbers (par- 
ticularly about enrollment figures over a five-year span), 
several participants indicated that the numbers they were 
providing were estimates or guesses since they had not kept 
good records of such data. Thus the researchers exercised
extreme caution in interpreting these statistics, with the 
understanding that they may not be exact representations of 
the phenomenon being measured. 



Findings and Discussion 

The research team received a total of 20 completed
surveys from CE providers. This number was reduced
slightly due to the removal of institutions or organizations
that identified themselves as being outside of the working
population, leaving 18 usable surveys. Revising their
population size to 36 providers, the investigators calculate
the response rate as 50 percent (numbers do not include
surveys removed for the previously noted reasons). This rate
offers some reassurance that the research team may rely on
the results to be statistically accurate. The very small
population size in question nevertheless leads them to be
cautious in interpreting results and their potential implications.
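
The response-rate arithmetic can be retraced in a few lines. The figures come from the text; the count of two out-of-population returns is inferred from the reported drop from 20 to 18 surveys and from 38 to 36 providers, and the variable names are illustrative only.

```python
# Figures reported in the text.
surveys_sent = 38
surveys_returned = 20
# Inferred: returns from organizations outside the working population.
outside_population = 2

revised_population = surveys_sent - outside_population
usable_surveys = surveys_returned - outside_population
response_rate = usable_surveys / revised_population

print(f"{usable_surveys} of {revised_population} = {response_rate:.0%}")
```

Removing the out-of-population returns from both the numerator and the denominator is what yields exactly 50 percent; leaving them in would give 20 of 38, or roughly 53 percent.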

The investigators used a standard 
statistical analysis package, SPSS, for 
all survey data entry and analysis. The 
primary analysis used was frequency 
distribution; this data is presented in 
tabular form, with discussion follow- 
ing each table. 



Survey Responses

Readers are invited to consult the appendix to examine the survey
instrument; the report uses the abbreviation "Q" followed by the
question number to indicate from which question the data are drawn
(thus, Q1 refers to Question 1).

Availability of Course Offerings

As stated previously, 18 surveys from CE providers were used in the
final analysis. Of those 18 usable surveys, 13 organizations indicated
that they offered workshops in preservation (Q1). Those organizations
that teach preservation workshops are more likely to offer a series of
sessions touching upon preservation issues rather than just a single
workshop (Q2): out of 13 respondents, 10 organizations (76.9 percent)
offer more than 3 workshops, 2 organizations (15.4 percent) offer 3
workshops, and 1 organization offers 2 workshops (7.7 percent). The
investigators interpret these results to be an indication of the
popularity of preservation as a topic for CE workshops. The hands-on
nature of many of these programs appeals to both professionals and
paraprofessionals, who see them as having practical use (see also the
discussion below of reasons for attending preservation workshops).

Enrollment in Preservation Workshops

CE providers were asked to list the workshops they offered by title,
indicate their frequency, and give the enrollment figures for the
period of 1999-2003 (Q3) (see table 1). Unfortunately, the
investigators are unable to report the total number of workshops
offered in this period, due to variations in the way that this data
was reported (some respondents did not indicate how many times in a
year that certain workshops were offered).

Table 1. Frequency of preservation workshops offered by continuing education providers (N varies)

                                                                                          Irregular or
                                                                             More than    unspecified   Total number
Type of workshop                                       Annually  Biannually  once a year  frequency     of providers
Care and handling/collections conservation             2         1           4            5             12
Book repair                                            -         1           2            3             6
Commercial binding                                     1         -           -            1             2
Management of environment/pest and mold control        2         -           1            3             6
Disaster planning/emergency management                 3         -           4            5             12
Exhibits and security                                  -         -           1            4             5
Care of time-based (audiovisual) and visual materials  1         -           -            4             5
Reformatting and digitization                          -         -           2            2             4
Grant writing and fund-raising                         1         -           -            2             3

Table 2 data show that disaster planning and emergency
management workshops have consistently had the most
appeal for CE students. The topic is offered by the majority
of respondents, has high enrollment, and is most likely to be
offered two or more times a year. The investigators suspect
that the spike in enrollment for these workshops in 2000 may 
have been due to a state-sponsored program that promoted 
disaster planning in that year. Given the continued interest 
in disaster planning in the wake of recent natural disasters 
such as Hurricane Katrina and concern over terrorism, the 
researchers suggest that interest is likely to remain strong. 
Workshops focusing on the management of environmental 
conditions, including pest and mold control, show steady 
enrollment. Interest in the programs is not surprising, as 
they complement offerings in disaster planning. Problems 
with pests and mold often materialize as a result of water- 
related disasters. In considering the enrollment in 2001, 
the investigators believe that this increase may be another 
example of a state or regionally sponsored educational offer- 
ing that generated the upswing. 

The popularity of care, handling, and book repair pro- 
grams also seems fairly consistent over the five-year period; 
half of the respondents report offering care and handling at 
least once a year (one-third of them offer it more than once 
a year). Investigators suspect that many of the enrollees in 
these classes are either paraprofessionals or professional 
librarians who did not have the opportunity to take preser- 
vation in their LIS graduate program. Also, those students 
who may have had exposure to the administrative side of 
preservation in previous courses, but not some of the more 
technical aspects, may find this workshop to be of interest. 
This topic also holds appeal for those practitioners working 
in institutions where resources 
are minimal; improvements in 
care and handling of materials, 
such as proper shelving and 
housekeeping, are often inex- 
pensive to implement. 

The data reveal several 
other interesting trends, par- 
ticularly the increasing interest 
in the preservation of audio- 
visual media. Workshops in 
time-based and visual materi- 
als show steady increases in 
enrollment from 1999 to 2003, 
as more and more cultural 
heritage professionals become 
cognizant of the importance 
of preserving these types of 
materials. 

Reformatting and digiti- 
zation workshops are still in 
demand, although the down- 
ward trend indicates that their 
initial appeal may be waning 
somewhat due to the maturation
of institutional practices in establishing and sustaining
digitization projects. While the investigators speculate that 
these classes initially attracted many librarians and archivists 
who were given the responsibility for managing or initiat- 
ing digitization projects, the demand for this information 
also may be partially fulfilled through graduate education 
offerings in digital libraries that have emerged in the past 
decade. 

The small number of individuals taking workshops in 
commercial binding may be tied to the reduction in the 
number of print subscriptions in favor of electronic journal 
subscriptions, as well as increased interest in reallocating 
staff and fiscal resources to digitization projects. These 
trends are not surprising, given the proliferation of new 
media as part of the responsibilities of librarians and archivists.
The heterogeneity of most collections demands that
information professionals become versed in the preservation 
requirements of many different types of media. 

Shifting resource and budgetary management may also 
affect grant writing and fund-raising efforts. While work- 
shops focusing on these areas currently have the benefit of 
a solid enrollment rate, the investigators expect that institu- 
tions will continue to place a greater emphasis on securing 
outside funding, which may drive enrollment rates higher. 
The strain on operating budgets will compel institutions to 
educate their staff in how to write viable grant proposals 
that will stand out as superior in an increasingly competitive 
funding environment. 
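
The enrollment patterns discussed in this section can be cross-checked against the figures reported in Table 2. The sketch below is offered only as a consistency check, not as the investigators' analysis (which was carried out in SPSS); it covers the eight rows that list all five yearly figures, confirms that each row sums to its reported total, and reports the peak year for each topic.

```python
YEARS = (1999, 2000, 2001, 2002, 2003)

# (yearly enrollments, reported total), copied from Table 2.
enrollment = {
    "Care and handling/collections conservation": ((319, 351, 231, 267, 424), 1592),
    "Book repair": ((173, 106, 182, 151, 261), 873),
    "Commercial binding": ((14, 11, 12, 21, 9), 67),
    "Management of environment/pest and mold control": ((56, 42, 110, 86, 71), 365),
    "Disaster planning/emergency management": ((515, 855, 483, 439, 386), 2678),
    "Exhibits and security": ((17, 63, 26, 16, 122), 244),
    "Care of time-based (audiovisual) and visual materials": ((67, 26, 193, 165, 120), 571),
    "Reformatting and digitization": ((356, 292, 274, 166, 145), 1233),
}


def peak_year(counts):
    """Return the year with the highest reported enrollment."""
    return YEARS[counts.index(max(counts))]


for topic, (counts, reported_total) in enrollment.items():
    # Each row's yearly figures should add up to the reported total.
    assert sum(counts) == reported_total, topic
    print(f"{topic}: total {sum(counts):,}, peak in {peak_year(counts)}")
```

The peaks for disaster planning (2000) and for environmental management (2001) correspond to the two increases that the investigators attribute to state or regionally sponsored programs.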



Table 2. Enrollment statistics for preservation workshops, 1999-2003

Type of workshop                                             1999  2000  2001  2002  2003  Total (1999-2003)
Care and handling/collections conservation (N=12)            319   351   231   267   424   1,592
Book repair (N=6)                                            173   106   182   151   261   873
Commercial binding (N=2)                                     14    11    12    21    9     67
Management of environment/pest and mold control (N=6)        56    42    110   86    71    365
Disaster planning/emergency management (N=12)                515   855   483   439   386   2,678
Exhibits and security (N=5)                                  17    63    26    16    122   244
Care of time-based (audiovisual) and visual materials (N=5)  67    26    193   165   120   571
Reformatting and digitization (N=4)                          356   292   274   166   145   1,233
Grant writing and fund-raising (N=3)                         97 17 160 95 369
Attendance and participation in workshops about
exhibits and security have some fluctuation, but interest 
remains strong. The diversity of collections presents institu- 
tions more opportunities to showcase their treasures and 
highlight a specific corpus of information amidst the greater 
body of work, yet the fragility and vulnerability of these 
materials requires archivists and librarians to learn how to 
exercise caution in presenting them. 

Audience 

Eight respondents gave information about the types of 
students who enrolled in their workshops (Q16). The 
investigators calculated the mean of reported percentages 
for the following categories: administrators (13.1 percent), 
supervisors or department heads (15 percent), entry-level 
professionals (22.5 percent), support staff (i.e., paraprofes- 
sionals, 30.6 percent), students (7 percent), volunteers (7.8 
percent), and others (4 percent). Other types of attendees 
noted by respondents included the general public and facilities
staff. One respondent wrote in the margins of the survey
instrument that the composition of the audience depends 
upon the topic of the workshop. While disaster preparation 
and recovery tended to draw administrators, supervisors, 
department heads, entry-level professionals, and support 
staff, the digitization workshops were composed of non- 
supervisory entry-level professionals, support staff, students, 
and faculty ("many of them senior faculty," a respondent 
reported). The high number of paraprofessionals and entry- 
level professionals (those segments of the audience comprise 
53.1 percent) suggests that these individuals are arriving on 
the job with little or no exposure to preservation concepts 
or experience with preservation work. For entry-level
professionals in particular, the significant number of MLIS
graduates who have had minimal preservation education is
troubling.
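
The combined share cited above follows directly from the reported category means. A brief sketch, using only the percentages given in the text (the dictionary keys are shorthand labels), shows the addition and confirms that the seven categories account for the whole audience.

```python
# Mean reported share of workshop audience by category (percent), from Q16.
audience = {
    "administrators": 13.1,
    "supervisors or department heads": 15.0,
    "entry-level professionals": 22.5,
    "support staff": 30.6,
    "students": 7.0,
    "volunteers": 7.8,
    "others": 4.0,
}

# Entry-level professionals and support staff taken together.
combined = audience["entry-level professionals"] + audience["support staff"]
total = sum(audience.values())

print(f"entry-level professionals + support staff: {combined:.1f} percent")
print(f"all categories: {total:.1f} percent")
```

Because these are means of percentages reported by only eight respondents, they describe the typical provider's audience rather than the share of all enrollees.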

Reasons for Attending Preservation Workshops 

The reasons why attendees enroll in preservation workshops 
are varied. The 11 organizations offering data on this ques- 
tion (Q17) cited the following motives for enrollment: CE 
credits (3 organizations, 27.3 percent), workshop required 
for performing job duties (9 organizations, 81.8 percent), 
general interest in subject matter (9 organizations, 81.8 per- 
cent), and other reasons (4 organizations, 36.4 percent). The 
other reasons mentioned included: 

• part of degree program; 

• continuing education (no CE credits awarded); 

• new job responsibility; and 

• "course useful for understanding reasons behind 
techniques or work (for example, book repair, or 
introduction to XML)." 



From this data, investigators surmise that students are most
likely to enroll when they are beginning a new job, have new
job responsibilities, or when the workshop offers a hot topic
such as digitization with which students feel they should be
familiar. The researchers also infer that employees may be
more likely to take workshops if their employer subsidizes 
the cost of enrollment, which may help to explain the high 
percentage of organizations reporting that enrollees cite 
general interest as a reason for taking classes. 

Credentials and CE Credits 

Among survey respondents, no CE providers offered a cre- 
dential in preservation or preservation management, aside 
from one program that is affiliated with an LIS school (Q4). 
Several providers do offer CE credits, however (3 of 13
respondents, or 23.1 percent, grant credits) (Q5). The
investigators suspect that public and school librarians tend 
to be most interested in CE credits, as most academic librar- 
ians do not have CE requirements. 

Faculty Resources 

The individuals who teach preservation in CE programs 
consist largely of professional conservators and preservation 
administrators (Q9). Many of these instructors work full- 
time or part-time for field service programs (comprising 
almost two-thirds of the total number of faculty), while the 
rest work as consultants for some of the smaller regional 
preservation alliances, and organizations that function large- 
ly on a volunteer basis. Just how many of the full-time and 
part-time staff members also "moonlight" as consultants for 
the smaller organizations is unknown, but anecdotal evi- 
dence suggests that the percentage of overlap between the 
two is significant. The investigators interpret the high num- 
ber of faculty who work full-time for these organizations 
(42, or 63.6 percent) as an indication of survey respondents' 
strong commitment to preservation education. 
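
The faculty percentages can be recomputed from the counts in Table 3. The sketch below uses only those reported counts; it reproduces the 63.6 percent full-time share and shows why the rounded column in the table falls just short of 100.

```python
# Faculty counts as reported in Table 3.
faculty = {
    "Full-time staff (conservation training)": 29,
    "Full-time staff (preservation administration training)": 13,
    "Part-time staff (conservation training)": 1,
    "Part-time staff (preservation administration training)": 2,
    "Consultants hired on contract basis": 20,
    "Volunteers": 1,
}

total = sum(faculty.values())

# Share of each category, rounded to one decimal place as in the table.
shares = {rank: round(100 * count / total, 1) for rank, count in faculty.items()}

full_time = sum(count for rank, count in faculty.items() if rank.startswith("Full-time"))
full_time_share = round(100 * full_time / total, 1)

# Summing the already-rounded shares loses a tenth of a point.
rounded_sum = round(sum(shares.values()), 1)

print(f"full-time: {full_time} of {total} ({full_time_share} percent)")
print(f"sum of rounded shares: {rounded_sum}")
```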

Credentials of Educators 

Instructors of preservation workshops generally possess 
at minimum a professional-level master's degree (MLIS 
or equivalent); 42 of the instructors at the 13 responding 
organizations have such a background (Q10). Many of them 
also possess a post-master's degree certificate in conserva- 
tion or preservation administration (15 instructors). Ph.D.s 
teaching CE courses are a rarity; the lone Ph.D. reported 
in the survey was qualified in history, not library science. 
Ph.D.s serving as CE instructors are likely to remain scarce 
as few Ph.D. students are specializing in preservation at this 
time. In addition, many of the workshops focus on practi- 
cal day-to-day skills, with which many Ph.D.s may not be 
as familiar. Respondents also cited extensive experience in
conservation benchwork as a valued credential. Other types of
credentials mentioned (by 9 instructors) included benchwork, a degree
in museum studies, and an internship in a preservation department at
an Association of Research Libraries (ARL) library. Because of the
practical emphasis of many of the CE workshops, practitioners with
significant technical expertise and administrative experience appear
to be the most desirable candidates for instructor positions.

Table 3. Preservation faculty in continuing education (broken down by rank; N=13)

Type of faculty                                          Number of faculty   Percentage of total number of faculty
Full-time staff (conservation training)                  29                  43.9
Full-time staff (preservation administration training)   13                  19.7
Part-time staff (conservation training)                  1                   1.5
Part-time staff (preservation administration training)   2                   3.0
Consultants hired on contract basis                      20                  30.3
Volunteers                                               1                   1.5
Total                                                    66                  100*

* Percentages do not add up to 100 due to rounding.

Hiring in Preservation CE

The survey asked respondents to indicate whether or not they planned
to hire additional instructors to teach CE courses in preservation.
Out of the 13 respondents to this question, 7 (53.8 percent) reported
in the affirmative, while 6 (46.2 percent) said that they had no plans
to hire additional staff at this time (Q11). Those who responded in
the affirmative (and one respondent who had responded in the negative)
indicated the following types of field service positions would be
offered: 3 organizations would like to hire a conservator on a
full-time basis, 4 organizations would like to hire consultants on a
contractual basis, and 1 organization would like to use more
volunteers (Q12). Investigators interpret these data as a sign of
positive growth for CE in the preservation arena.

The Preservation Curriculum in CE

As might be expected, workshops tend to be much more focused than
graduate school courses, less theoretical, and oriented toward issues
of practice and technique. Table 4 summarizes the types of topics and
formats covered in workshops offered by organizations that
participated in the survey (Q6). Disaster recovery and control of
environmental hazards have significant coverage in preservation
education workshops. The data also show the continued importance of
teaching preservation of paper-based media, book repair, enclosures
and housing, and visual materials. While digitization, electronic
media objects, audiovisual media, and electronic records have received
some attention by CE providers, the primary focus of these workshops
is still on the perennial preservation imperatives of books, paper,
and photographs.

Other topics and activities mentioned included "metadata relating to
digitization or preservation," "copyright as it relates to
digitization," and "packing and shipping." Other formats mentioned
included:



• "all non-paper-based collections: ceramics, glass, 
metals, organic material, plastics, textiles, ptgs [paint- 
ings], etc."; 

• "paintings, ethnographic material (including Native 
American), art on paper, frames, polychrome sculp- 
ture"; 

• scrapbooks; and 

• archival material. 

Because a number of the organizations offer workshops 
in the conservation of cultural heritage objects, they cited 
various other formats that one may not consider to be part 
of the library or archival preservation agenda. Interestingly, 
conservation treatments are not often taught; this omission 
may be related to the distinction between activities that may 
be carried out by preservation administrators and support 
staff and those repairs and treatments that require the atten- 
tion of a trained conservator. 

Preservation Issues and Related Workshops 

Preservation also plays a part in other workshops in which it 
is not the main focus. In particular, workshops on archives 
and manuscripts, special collections, and collections man- 
agement are most likely to discuss preservation issues. Other 
workshop topics mentioned included rare books librarian- 
ship, digital libraries, technical services, and security (see 
table 5). Many organizations cited this question as "not 
applicable" because all of their workshop offerings focus on 
preservation (Q7). 

Table 4. Topics, activities, and formats covered in continuing education preservation workshops (N=13)

                                                                  Number of       Number of
                                                                  Providers       Providers
Topic or Format Covered                                           Offering        Not Offering
(Topics)
History and theory of conservation/preservation                   5 (38.5%)       8 (61.5%)
Ethics of conservation/preservation                               5 (38.5%)       8 (61.5%)
Conservation science (including materials deterioration)          5 (38.5%)       8 (61.5%)
Book repair and rebinding (including hands-on practice)           8 (61.5%)       5 (38.5%)
Conservation treatments                                           3 (23.1%)       10 (76.9%)
Enclosures and housing                                            8 (61.5%)       5 (38.5%)
Reformatting options (microfilming, photocopying, digitization)   7 (53.8%)       6 (46.2%)
Control of environmental conditions (temperature, relative        10 (76.9%)      3 (23.1%)
  humidity, air quality, pest management)
Preservation assessment (surveying and policy recommendations)    7 (53.8%)       6 (46.2%)
Management (personnel, fiscal, facilities)                        4 (30.8%)       9 (69.2%)
Emergency preparedness and disaster recovery                      12 (92.3%)      1 (7.7%)
Other topics                                                      2 (15.4%)       11 (84.6%)
(Formats)
Paper-based media (books and documents)                           12 (92.3%)      1 (7.7%)
Photographic media                                                12 (92.3%)      1 (7.7%)
Visual materials                                                  7 (53.8%)       6 (46.2%)
Audiovisual media (sound recordings and moving images)            7 (53.8%)       6 (46.2%)
Magnetic and optical media (removable storage media)              6 (46.2%)       7 (53.8%)
Electronic records                                                4 (30.8%)       9 (69.2%)
Digital library objects (both digitized and "born digital")       5 (38.5%)       8 (61.5%)
Other formats                                                     4 (30.8%)       9 (69.2%)

Survey participants were also asked to list workshops
that included "preservation as a significant component
(defined as spending at least 10 percent of workshop time
speaking about preservation issues)" (Q8). Only 2 out of 
13 organizations reported such workshops, largely because 
many of these organizations only offer workshops in the 
area of preservation. Organizations that offer other types of 
workshops list the following classes as having a significant 
preservation component: 

• Introduction to Library Collections (30 percent of 
workshop); 

• "Digital topics" (10 percent of workshop); 

• Commercial Library Binding (60 percent of work- 
shop); and 

• Local History and Special Collections (30 percent of 
workshop). 

Plans for the Future 

Ten out of 13 respondents, or 76.9 percent, responded 
affirmatively to the question, "Does your institution plan to 
introduce new workshops in preservation in the near future 
(in the next 1-3 years)?" (Q13). Table 6 summarizes those
subjects seen as potential new workshops (Q14, respondents 
could mark more than one choice). 

Respondents appear to be most interested in adding 
workshops to deal with photographic and other types of 
visual materials. Somewhat counterintuitively, interest in 
reformatting and digital preservation is weak, leading inves- 
tigators to wonder whether or not current offerings are seen 
as sufficient and meeting demand. Four providers indicated 
that they desired to add a collections conservation labora- 
tory class, which researchers interpret as a response to stu- 
dents' continuing demand for more hands-on opportunities. 
This data may also suggest that institutions have increasing 
interest in supporting in-house repair programs as part of 
a triage strategy (identifying and repairing minor damage 
early on, in hopes of increasing the number of times materi- 
als can be circulated). 

Other future workshops mentioned included the 
following: 

• "permanence and safety of artist materials"; 

• "writing a disaster plan," "disaster planning," "disaster 
response"; 

• "conducting building risk assessments"; 

• "collection care planning and management, conserva- 
tion/preservation planning, handling and housekeep- 
ing for collections, earthquake supports and mounts, 
protecting collections on display and in storage, inte- 
grated pest management"; 

• "environmental threats"; and 

• "designing conservation concerns into new buildings 
and additions." 
As these topics indicate, providers are most interested in
offering new workshops that target specific topics within the 
broader areas already defined. The nature of CE workshops, 
which rarely last longer than a day, encourages providers to 
narrow the focus and scope of programs. 

The other three organizations showed no interest in
additional workshops in the area of preservation (Q15). 
Reasons cited included: 

• low enrollment in current offerings (1 respondent); 

• lack of available expertise to offer workshop (1 
respondent); 

• lack of fiscal resources (1 respondent); and 

• "[organization] will merge with [professional associa- 
tion] and preservation will become a component of 
their workshop offerings." 

Because of the small number of responses, identifying any 
sort of trend from this data is difficult, other than the fact 
that a lack of human and fiscal resources is slightly more 
likely to affect an organization's ability to offer new work- 
shops than other factors. 

Conclusion 

Data from this study supports the premise that CE is pick- 
ing up much of the slack that LIS programs are creating, 
offering programs on multiple topics not given sufficient 
coverage at the graduate level; additionally, CE courses 
often provide the only preservation education for parapro- 
fessionals and administrators who did not have the benefit 
of such a course in their graduate program. 

After examining the survey results, the investigators 
wonder whether it is problematic that CE providers often 
serve as the primary source for preservation education. 
When comparing the current state of preservation educa- 
tion to the circumstances that existed fifteen years ago, the 
research team sees little actual change over this period. The 
specificity of the programs and the brevity of the encoun- 
ters often hinder efforts to transition CE into the kind of 
educational experience envisioned by Cloonan and others, 
i.e., the opportunity to facilitate the sharing of knowledge 
through "experience, creativity, analysis, and the exchange 
of ideas." 20 Hence, CE should not be considered a substitute 
for graduate education, but ideally, a supplement that builds 
upon a foundation already laid by LIS programs, and a path 
towards specialization in preservation. The investigators 
suggest that CE providers and institutions consider explor- 
ing new avenues for providing the type of in-depth expe- 
rience introduced by the intensive models (for example, 
Rutgers and the other preservation management institutes 
offered in the past), but adapted to the online environment,
which would keep costs down and make them more accessible
to students whose institutions could not support onsite
attendance. Although not all topics lend themselves easily to
the online environment, digital preservation is an area that
seems particularly suited to this model.

Table 5. Preservation integrated into other workshops? (N=13)

Other Workshops                                                Yes           No
Archives and manuscripts                                       3 (23.1%)     10 (76.9%)
Rare books librarianship                                       1 (7.7%)      12 (92.3%)
Map librarianship                                              0 (0%)        13 (100%)
Special collections                                            4 (30.8%)     9 (69.2%)
Collections management/development                             4 (30.8%)     9 (69.2%)
Digital libraries                                              2 (15.4%)     11 (84.6%)
Records management (including electronic records management)   0 (0%)        13 (100%)
Technical services (including serials)                         2 (15.4%)     11 (84.6%)
Other (Security)                                               1 (7.7%)      12 (92.3%)
Not applicable                                                 7 (53.8%)     6 (46.2%)

Table 6. Interest in Expanding Preservation Curricula (N=10)

Workshop Topic                                Number of Respondents
Introductory course in preservation history   1
Collections conservation laboratory           4
Reformatting                                  1
Photographic media                            4
Visual materials                              5
Audiovisual media                             2
Digital preservation                          1
Other courses                                 8

The investigators found that the data generated from 
this study answered many of the questions raised about the 
"who, what, when, and where" of CE in preservation, but 
did not sufficiently capture the underlying explanations of 
certain phenomena. Questions that remain unanswered 
include: 

• Is growth in CE driven more by demand or by the 
availability of government subsidies of both provider 
programs and enrollment in those programs? What 
happens to CE programs if government funding is 
severely curtailed or eliminated — will employing 
institutions assume the full costs of providing CE 
opportunities to their employees? 
• How do graduate LIS curricula influence CE curricula,
and vice versa? Although CE programs can
be more agile in offering new topics, in areas such 
as audiovisual and visual materials, to what level of 
complexity can CE aspire, given the brief nature of 
most workshops? 

The investigators feel that these questions are best 
addressed using another methodological approach, ideally a 
qualitative one. Thus this study represents the first phase of 
a larger research project. Building upon the initial results of 
the survey, the investigators hope to follow up with in-depth 
interviewing of key informants involved in preservation 
CE at selected sites. After analyzing the interview data and 
comparing those results to those of the survey, the investi- 
gators hope to have a more complete picture of the state of 
preservation CE in the United States, which will be used to 
create recommendations for directing preservation CE in 
the next decade. 

Appendix: Survey Instrument

Preservation Education Needs for the Next Generation of Information Professionals

Survey for Educators Teaching Preservation in Field Service Programs and Other 
Providers of Continuing Education for Preservation 

Types of Courses/Frequency Offered

1. Does your organization offer workshops on preservation and/or conservation of library/ 
archival materials? 

Yes (go to next question) 

No (go to question 18) 

2. How many workshops do you offer on preservation of library/archival materials? Do not 
include courses that merely incorporate preservation as part of a related topic (such as 
archives or collection development) unless preservation issues constitute at least one- 
third of the material covered. 

1 

2 

3 

More than 3 

3. List each preservation course offered, and indicate the regularity with which it is offered. 
Also indicate its enrollment over the last five years, broken down by years. 



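Course Title      Frequency      Enrollment over the Last Five Years

____________      _________      2003 ___  2002 ___  2001 ___  2000 ___  1999 ___
____________      _________      2003 ___  2002 ___  2001 ___  2000 ___  1999 ___
____________      _________      2003 ___  2002 ___  2001 ___  2000 ___  1999 ___
____________      _________      2003 ___  2002 ___  2001 ___  2000 ___  1999 ___
____________      _________      2003 ___  2002 ___  2001 ___  2000 ___  1999 ___
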
4. Does your organization offer a credential in preservation?
Yes 

No 

5. Does your organization offer continuing education credits? 
Yes 

No 

Content of Preservation/Conservation Coursework 

6. What topics are covered in preservation coursework? Check all that apply. 
History and theory of conservation/preservation 

Ethics of conservation/preservation

Conservation science (including materials deterioration) 

Activities: 

Book repair and rebinding (including hands-on practice) 

Conservation treatments 

Enclosures and housing 

Reformatting options (microfilming, photocopying, digitization) 

Control of environmental conditions (temperature, relative humidity, air quality, pest management)

Preservation assessment (surveying and policy recommendations) 

Management (personnel, fiscal, facilities) 

Emergency preparedness and disaster recovery 

Staff and user education 

Other: 

Formats: 

Paper-based media (books and documents) 

Photographic media 

Visual materials (architectural drawings, maps, prints) 

Audiovisual media (sound recordings and moving images) 

Magnetic and optical media (removable storage media) 

Electronic records 

Digital library objects (both digitized and "born digital") 

Other: 

Related Coursework 

7. How do you incorporate preservation into other workshops? Please check all that apply. 
Archives and manuscripts 

Rare books librarianship 

Map librarianship 

Special collections 

Collections management/development 

Digital libraries 

Records management (including electronic records management) 

Technical services (including serials)

Other: 

Not applicable 

8. Please list any related courses that include preservation as a significant component 
(defined as spending at least 10 percent of workshop time speaking about preservation 
issues). 



Course Title      Percentage of Course Devoted to Preservation Issues


9. Who teaches preservation workshops for your organization? Fill in the blanks with the 
number of instructors. Do not count instructors who merely incorporate preservation as 
part of a related topic (such as technical services). 

Full-time staff with conservation training and experience 

Full-time staff with preservation administration training and experience 

Part-time staff with conservation training and experience 

Part-time staff with preservation administration training and experience 

Consultants (hired on a contractual basis to teach particular courses) 

Volunteers 

10. How many faculty members hold: 

A professional-level master's degree? 

A certificate of advanced study in conservation or preservation? 

A Ph.D. degree? 

Another degree or certification (please list types: )? 



11. Do you have any plans to hire additional staff or recruit volunteers to teach in the area of 
preservation/conservation? 

Yes (go to next question) 

No (go to question 13) 

12. If yes, what type(s) of position(s) would be offered? Fill in the blanks with the number 
of positions. 

Full-time staff position for conservator 

Full-time staff position for preservation administrator 

Part-time staff position for conservator 

Part-time staff position for preservation administrator 

Consultant (hired on a contractual basis to teach particular courses) 

Volunteer work 

13. Does your institution plan to introduce new workshops in preservation in the near future 
(in the next 1-3 years)? 

Yes (go to next question) 

No (go to question 15) 

14. If yes, please list what type(s) of course(s) will be offered and when you hope to offer it
(them): 

Year Type of Course 

Introductory course in preservation history, theory, science, etc. 

Collections conservation laboratory experience (book repair, rebinding, deacidification, other treatments)

Reformatting (microfilming, copying, digitization)

Specialized preservation seminars in: 
Photographic media 

Visual materials (architectural drawings, maps, prints, etc.) 

Audiovisual media (sound recordings, moving images) 

Digital preservation (electronic records and other digital media) 

Other: 

Go to question 16. 

15. If no, why not? Check all that apply. 

Low enrollment in current preservation offerings 

Low enrollment in past preservation offerings 

Preservation felt to be discussed sufficiently in other workshops on related topics (e.g., technical services, collection development)

Lack of available expertise to offer workshop 

Lack of fiscal resources 

Other: 



Audience 

16. Please estimate average percentages of students who enroll in coursework: 
Administrators 

Supervisors or department heads 

Entry-level professionals 

Support staff (paraprofessionals) 

Students 

Volunteers 

Other: 

17. What reasons do attendees give for enrolling in your courses? Check all that apply. 
Continuing education credits 

Course required for performing job duties 

General interest 

Other: 

Future Participation in This Study of Preservation Education Needs

18. May the investigators of this study contact you or a representative of your institution 
again about participating in the next phase of this study? Please check the appropriate 
box below with your preference and include contact information if requested. 

No, I am not interested in further participation. Please do not contact me again. 

Yes, I (or a representative of my institution) would be interested in further participation. Please contact ________ at the following address, phone number, and/or e-mail:



Phone: E-Mail: 



Thank you for participating in this survey! Any further questions or comments may be directed 
to Dr. Karen F. Gracy (kgracy@pitt.edu) or Ms. Jean Ann Croft (jeanann@pitt.edu). 

DACS and RDA

Insights and Questions from the New Archival Descriptive Standard

By Beth M. Whittaker 

Describing Archives: A Content Standard (DACS) is the new archival content 
standard published by the Society of American Archivists (SAA). The publication 
of this forward-thinking and comprehensive response to changing information 
needs and technologies should be of interest to all cataloging communities. DACS 
raises issues about content standards for resource description that should be 
addressed much more broadly. The library cataloging community is in the process
of an extensive revision of its cataloging codes, and new approaches in this
standard appear to be embodying some of the same concepts as DACS. DACS, 
therefore, can be seen as a smaller and more focused implementation of some 
of the principles that will emerge in the new Resource Description and Access 
(RDA). Simultaneously, the standard can be used to examine whether taking 
some of these developments further would improve access to materials. 

Describing Archives: A Content Standard (DACS) is the new archival con- 
tent standard published by the Society of American Archivists (SAA). 1 
Not simply an updated manual for cataloging archives, it is a forward-thinking 
and comprehensive response to changing information needs and technologies. 
Although a relatively recent publication, DACS has already generated discus- 
sion in the archival community. DACS raises issues about content standards for 
resource description that should be addressed beyond the archival community, 
as well. As the library cataloging community is in the process of an extensive
revision of its cataloging codes, DACS can be seen as a smaller and more 
focused implementation of some of the principles that will emerge in the new 
Resource Description and Access (RDA), which will replace the Anglo-American 
Cataloguing Rules (AACR). 

Beth M. Whittaker is Head, Special Collections Cataloging,
The Ohio State University, Columbus.
Submitted for publication April 6, 2006, pending
revision; revision submitted May 17,
2006, and accepted for publication.



Archival Description and Library Cataloging 

In order to understand how innovative DACS truly is, surveying the context from 
which it emerged is necessary. This paper will not provide a detailed history of 
archival cataloging, although general sources are available to do so. 2 Since DACS 
owes its structure to the characteristics of archival material, a few points are 
worth mentioning, particularly historic milestones in archival content standards 
and cataloging codes. 

One of the most prominent features of archival material (from a cataloging 
point of view) is the lack of a chief source of information. Kiesling has called 
archives a "non-transcription community," while books and serials catalogers 
form a "transcription community," in which bibliographic
descriptions are based largely on transcription of information
on items at hand. 3 Other non-transcription communities
are becoming more interested in exploring the role of
their descriptive information in a more bibliographic con- 
text. In this way archivists can serve as a model for film and 
video catalogers, computer files catalogers, museum objects 
catalogers, and others. 

Another prominent feature of archival description is 
the relationship among several types of abstracts of collec- 
tions: standard bibliographic records, finding aids, invento- 
ries, and so on. A one-to-one correspondence between the 
record and the "thing" being cataloged is not present. By the 
time descriptions of huge archival collections are recorded 
in bibliographic records, much information has been lost 
due to system restrictions and descriptive conventions. In 
observing this hierarchy of metadata in 1995, Hensen wrote, 
"It is absurd to imagine that the conventions of author-title 
cataloging with two or three subject headings could even 
begin to capture the complexity of most archival materials 
(even if they had authors and titles.)" 4 This perception of the 
limitations of library cataloging to describe archival materi- 
als heavily influenced the development of DACS. 

Prior to 1967, rules for manuscript cataloging did 
not appear in library cataloging manuals at all. Choice of 
entry for manuscripts was addressed in the 1949 A.L.A. 
Cataloging Rules for Author and Title Entries, but no guid- 
ance for description was given. 5 The 1967 Anglo-American 
Cataloguing Rules (AACR1) introduced rules for describ- 
ing both individual manuscripts (200-204) and collections 
(205-207). 6 Anglo-American Cataloguing Rules, 2nd ed. 
(AACR2) deviated from AACR1's approach. 7 This edition
created rules in chapter 4 for cataloging manuscripts that
have been characterized as "not archival." 8

Archives, Personal Papers, and Manuscripts (APPM) 
was a response from the archival community to AACR2, 
which was seen as inadequate for modern manuscript and 
archival description. 9 APPM demonstrated that "the system 
of library-based cataloging techniques embodied in the sec- 
ond edition of Anglo-American Cataloguing Rules (AACR 
2) could be adapted to serve the needs of the archival com- 
munity." 10 In this way, it filled a niche for archives similar to 
other format-specific implementations of AACR. 

In recent years, two major developments affect- 
ing archival description have emerged: the International 
Council on Archives' General International Standard 
Archival Description (ISAD(G)) and International Standard 
Archival Authority Record for Corporate Bodies, Persons, 
and Families (ISAAR (CPF)). 11 Just as the Anglo-American 
cataloging community interprets the larger International 
Standard Bibliographic Description (ISBD) framework,
American archival cataloging rules have attempted to 
respond to changes in the international ISAD(G). ISAD(G) 



might be seen as an archival Dublin Core set of descriptive 
elements. These core elements can be used at any level of 
description (e.g., folder or series).

Attempts to create a joint descriptive standard for 
the American and Canadian archival communities and 
to accommodate international standards ISAD(G) and 
ISAAR(CPF) reached a state of hopeful optimism. Although 
there was not enough common ground between American 
and Canadian archivists to create joint content standards, 
"the dialogue between Canadian and U.S. archivists will 
surely continue." 12 In the meantime, DACS corresponds 
very closely to the elements of ISAD(G) and ISAAR(CPF) 
with only one element excluded. The Level of Description 
element is excluded based on the acknowledgement that no 
consensus exists on how to apply terminology for more than 
five levels of description, and that recording such complexity 
does not in itself link multilevel descriptions. 13 

DACS, like APPM before it, serves as a replacement 
for the skeletal rules in AACR2 chapter 4 for cataloging 
manuscripts, but makes conscious departures from AACR 
tradition in some ways. It "provides more specific guidance 
in the description of contemporary archival materials and 
eliminates some of the less user-friendly aspects of AACR2, 
including many abbreviations and the coded recording of 
uncertain dates, conventions necessitated by the space limi- 
tations of 3 x 5 catalog cards but no longer helpful or neces- 
sary in modern information systems." 14 Eliminating these 
less user-friendly aspects may pose the greatest challenge to 
our thinking about cataloging rules. 

Structure of DACS 

DACS begins with a "Statement of Principles," a "recapitula- 
tion of generally accepted archival principles." 15 This section 
recaps essential ways in which describing archival materials 
may differ from describing library materials, particularly in 
fundamental areas such as respect des fonds, the relationships
between arrangement and description, and the description 
of creators. Next is an "Overview of Archival Description," 
which outlines both Access Tools such as MARC 21 and 
Encoded Archival Description (EAD) finding aids, as well as 
Access Points that should be provided. 

"Part I: Describing Archival Materials" includes "rules 
to ensure the creation of consistent, appropriate, and self-
explanatory descriptions of archival material." 16 "Part II:
Describing Creators" offers a uniquely archival perspective. 
Naming creators is not sufficient. "Additional information 
is required regarding the persons, families, and corporate 
bodies responsible for the creation, assembly, accumulation, 
and/or maintenance and use of archival materials being 
described." 17 This indicates the importance of context in
archival description. 

"Part III: Forms of Names" consists of "information
about creating standardized forms for the names of persons, 
families, or corporate bodies associated with archival materi- 
als ... . These can be used in descriptive elements, archival 
authority records, or as index terms." 18 Finally, DACS con- 
cludes with appendixes, a glossary, a list of companion stan- 
dards, crosswalks, and full EAD and MARC 21 examples. 

DACS, AACR2, and RDA 

At the time of DACS's publication, its departures from 
AACR2 were nearly revolutionary. In summing up the 
changes in archival cataloging practices brought about by 
the possibilities of EAD-encoded finding aids and their 
relationship to cataloging, Hensen suggested that new 
cataloging paradigms had not yet emerged. Referring to 
the promise of revolutionary bibliographic control at the 
International Conference on the Principles and Future 
Direction of AACR convened in 1997, he believed the 

inertia inherent in existing catalogs of millions 
upon millions of bibliographic records is sufficient 
to discourage most library bureaucrats and admin- 
istrators from undertaking massive and systematic 
changes — particularly in an environment that is 
itself so volatile as to defy reasonable calculation. 
. . . [The] archival community . . . concluded that 
it must proceed on its own, while the library world 
may yet move more decisively. 19 

In the last few years, the ongoing process of develop- 
ment of new cataloging standards for mainstream materials 
has revealed more obvious parallels between DACS and the 
emerging successor to AACR2. The prospectus for RDA 
illustrates clearly that some of the major issues articulated 
in DACS are being considered within the library cataloging 
community as well. 20 

Prominent among them is that these rules should be 
based on principles, should cover all types of materials, 
should be easy to use and interpret, and "will be used as a 
resource beyond the library community to facilitate meta- 
data interoperability." 21 This broadening of the scope of 
AACR underscores the emerging Web-format world. Also 
important is the statement that "the language needs to be 
clearer and more direct, and that library jargon should be 
avoided." 22 

In keeping with the idea that RDA is marketed more 
towards metadata communities beyond libraries, rules will 
be structured "to facilitate application to a wide variety of 
resources" with general instructions that are "formulated in 
clear, concise, and simple terms," supplemented with more 
detailed instructions applicable to complicated situations. 23 



In addition, the standard will encompass a "general move- 
ment towards simplification and an emphasis on principle- 
based cataloger's judgment." 24 Another point of similarity is 
that RDA "establishes a clear line of separation between the 
recording of data and the presentation of data." 25

RDA's three-part structure seems to also closely paral- 
lel that of DACS, with the first part focusing on resource 
description. The second will cover the provision of access 
points for "relationships," and the third will cover the formulation
of name and title access points and other data used for
authority control. 26 

The development of format-specific rules for archives
and manuscripts within the context of RDA also merits 
mention. The Library of Congress (LC) and SAA have both 
responded to proposed archival rules to supersede AACR2 
chapter 4 in RDA. While the future integration of these 
comments and DACS's format-specific rules into RDA
remains unclear, the standards will likely continue to overlap
to some degree. 27

Major Issues Addressed in DACS 
Output Neutrality 

The output neutrality of DACS underscores a major ques- 
tion for the cataloging community at large. Is it necessary for 
cataloging standards, which have existed in a MARC-based 
world for at least twenty years (and a card-based world for 
much longer) to become output neutral? In fact, MARC 
records are simply manifestations of descriptions that could 
be output in any number of ways. For archival material, lon- 
ger, more complex descriptions can be created and coded as 
instances of EAD finding aids, which is why DACS provides 
examples to accompany its guidelines in both MARC and 
EAD formats. 

Catalogers do not need to be convinced of the value of 
standardization. Digital projects describing images at the 
item level, for example, may use part of our descriptive con- 
ventions in formulating name headings, and bibliographic 
descriptions themselves have been exposed to a larger 
audience (and divorced from the context of the catalog) 
through the Open WorldCat project. 28 Since data exchange
formats could change, the future needs of the archives com- 
munity could continue to be served by DACS descriptions 
in an increasingly mapped and cross-walked environment. 
Descriptions (or parts of descriptions) coded in an XML for- 
mat (such as EAD) are potentially reusable in limitless ways. 

This bifurcation of content and carrier appears to 
be the direction being taken by RDA. The Joint Steering 
Committee for Revision of AACR states that "what is being 
developed is in effect a new standard for resource descrip- 
tion and access, designed for the digital world" and that the 
new approach for RDA will have "instructions for recording 
data [that] will be presented independently of guidelines for
data presentation." 29 

This major change likely will be more difficult to imple- 
ment in a library world wedded to forms of display derived 
from catalog records than in the archival world, accustomed 
to many different forms of description. For example, how 
many catalogers still spend time "upgrading" records while 
copy cataloging by changing punctuation to conform to 
ISBD conventions? Although this behavior is nearly instinctive
among many catalogers, the content remains essentially
the same while time and energy are spent adapting
the carrier.

Content versus Context 

Closely related to output neutrality is the separation of 
descriptive content from historical or biographical context. 
In the cataloging world, these two factors have been closely 
linked. For example, although authority records reside in 
library catalogs, they provide context for understanding 
name headings, rather than describing materials created 
by the entities represented in the authority records them- 
selves. The increasingly common use of library authority 
files (particularly the LC Name Authority File) for nonli- 
brary cataloging indicates a potential need to broaden their 
usefulness. Tillett asserted, "as we open our authority files 
for access through the Internet, we find the authority file 
becoming a useful tool for other librarians and information 
professionals and even end-users." 30 

How much more might this be the case in the archi- 
val world, where archivists who maintain official files are 
often the acknowledged experts on a particular person or
organization? Although not explicitly mentioned in DACS, 
the creation of a parallel structure for creator information 
to EAD, called Encoded Archival Context, is worth exami- 
nation. 31 Archives have traditionally maintained extensive 
supplemental documentation on creators, necessary to 
fulfill their missions, particularly when the creators have a 
relationship with the archives themselves (such as in institu- 
tional archives.) DACS explicitly separates these two types 
of information in theory, with the potential to allow other 
users to benefit from this information in a variety of ways, 
rather than simply serving as a reference for librarians and
archivists. Users with systems that combine these types of 
records can continue to create functional descriptions. 

Levels of Description and Data Elements 

The existence of levels of description in archival practice 
is a central factor in DACS, meriting a brief but important 
first chapter. Haworth has argued that "given its hierarchical 
structure, archival description presents complex challenges 



that the MARC data structure was never designed to accom- 
modate." 32 This complexity of relationships is not unique; 
museum collections, digital projects, and other emergent 
communities have similar, if not identical issues. In cultural- 
heritage communities, descriptions of collections are often 
as — if not more — important to users than are descriptions 
of individual items, since the presence of an item within a 
larger collection often conveys important information about 
its provenance and use. 

Although many catalogers (and perhaps most non-catalogers)
think of the MARC structure as flat, AACR2
did articulate levels of description; MARC has developed to 
accommodate relationships among these levels, most nota- 
bly with linking fields and series tracings. These mechanisms 
are often difficult to exploit in library systems, but they exist. 
The widespread inclusion of table of contents information 
in MARC records, for example, has changed the nature 
of the relationship between the piece and the record and 
opened the possibility of a network of relationships among 
descriptions. The inherent relationships among serials, 
which merge, cease, resume, and split off from one another, 
highlight another area where complexity built into MARC 
could be illustrated better in catalog records. Outside the 
MARC world, links between digital files, such as images 
and the metadata describing them within a database, show 
additional possibilities to highlight these relationships. The 
importance of levels of description successfully articu- 
lated by DACS for archival material should encourage us to 
explore this concept in other types of materials as well. 

The Functional Requirements for Bibliographic Records 
(FRBR) model also will be on the minds of catalogers exam- 
ining the new standard. This is particularly interesting as 
it points out parallels between "levels of description" and 
the FRBR model. For example, if collections are treated 
as works, what is the role of FRBR in archival descriptions 
of archival series or even items? 33 Can individual letters be 
seen as manifestations of the content of a larger collection? 

Another bold statement that appears, at first, to con- 
tradict existing MARC structure is DACS's assertion that 
data elements are mutually exclusive — "The purpose and 
scope of each element has been defined so that the pre- 
scribed information can go in one place only." 34 How would 
this principle be applied in a MARC universe, particularly 
where catalogers have often deliberately replicated informa- 
tion from coded fixed fields in narrative variable fields in an 
attempt to overcome limitations of library systems? Perhaps 
restricting information to one place only would force the 
issue of displaying now-invisible content hidden in coded 
strings (such as 007 fields.) An approach more consistent 
with the spirit of DACS might call instead for standardizing
such information in eye-readable fields in ways that are 
immediately comprehensible to users. 

Abbreviations

This spirit of user-friendliness is very prominent in DACS's 
recommendations rejecting standard abbreviations. Specific 
examples include the extent element (2.5) where a note 
explains, "It is recommended, though not required, that
terms reflecting physical extent be spelled out rather than 
abbreviated, as abbreviations may not be understood by all 
users." 35 The emphasis on the user is one of DACS's more 
controversial recommendations.

When considering the amount of time spent to type 
"feet" versus "ft.," for example, enhancing clarity for a 
variety of users perhaps not fluent in English and very 
likely unfamiliar with jargon is worth a sacrifice of a few key- 
strokes. Depending on the system used for creating DACS- 
compliant descriptions, abbreviations could be expanded 
automatically, in much the same way that some integrated 
library systems expand relator codes into relator terms 
between MARC records and public displays. In rejecting 
a holdover from a paper-based descriptive environment, 
DACS is pushing the envelope in a way that could be revo- 
lutionary if applied more broadly. 
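
The automatic expansion described above can be illustrated with a minimal sketch. It assumes a small, hypothetical lookup table of extent abbreviations (the table, the function name `expand_extent`, and the sample extent statements are illustrative only and are not drawn from DACS or from any particular library system). A display layer built this way could store either form and always present the spelled-out term to users.

```python
import re

# Hypothetical mapping of extent abbreviations to spelled-out terms.
# These entries are illustrative; they are not prescribed by DACS.
EXTENT_TERMS = {
    "ft.": "feet",
    "in.": "inches",
    "vol.": "volumes",
    "p.": "pages",
}

# Match an abbreviation only when it is not glued to other letters,
# so that "in." inside a longer word is left alone.
_ABBREVIATION = re.compile(
    r"(?<![A-Za-z])("
    + "|".join(re.escape(a) for a in sorted(EXTENT_TERMS, key=len, reverse=True))
    + r")(?![A-Za-z])"
)


def expand_extent(statement: str) -> str:
    """Return the extent statement with known abbreviations spelled out."""
    return _ABBREVIATION.sub(lambda m: EXTENT_TERMS[m.group(1)], statement)


print(expand_extent("45 linear ft. (90 boxes)"))
```

The sketch ignores singular and plural agreement (it would turn "1 ft." into "1 feet"), which a production display routine would need to handle.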

Creatorship and Name Headings 

DACS takes a different approach to authorship than AACR2, 
defining "creator" as "a person, family, or corporate body 
that created, assembled, accumulated, and/or maintained 
and used records in the conduct of personal or corporate 
activity. A creator can also be responsible for the intellectual 
content of a single item." 36 AACR2 does not define a creator
at all, but instead defines personal author as "the person
chiefly responsible for the creation of the intellectual or
artistic content of a work," along with specific functions like
"editor," "producer," and "collaborator." 37 Rules in AACR2
chapter 21 also detail concepts of shared responsibility and 
mixed responsibility. Despite this sophistication, even expe- 
rienced catalogers sometimes have trouble determining how 
to apply these rules in complex situations. 

One example highlights the difficulty of applying these 
concepts in the current bibliographic context. Though an
individual could be a "personal author" for a blog, the content
linked from the author's comments on news articles
complicates the authorship to a mind-boggling degree. A 
blogger may be a creator, but — according to AACR2 ter- 
minology — is probably not an author. This complexity of 
creatorship is present in other formats as well, although 
mainstream cataloging practice has tended to try to fit these 
formats into a bibliocentric box, with detailed rules for 
determining chief responsibility even for works with complex
creatorship.

One of many frustrations wrought for catalogers by the
specificity of the MARC format is the distinction between
creators as names and as subjects. Depending on a library
system's indexing rules, as well as local indexing decisions 
often driven by cost, creators of collections may need to be 
indexed twice, as both 6xx (subject) and 7xx (name) fields, in 
order to ensure users will be able to locate relevant material 
however they search. This leads to duplication that in itself 
can sometimes be misleading. Cataloging rules continue to 
appear needlessly complicated to the outside world. 

One way in which these distinctions between "author" 
and "subject" headings have been acutely confusing is the 
use of family names. AACR2 does not allow for describ- 
ing families as "authors," yet "the use of family names as 
creators in the description of archives was part of previous 
bibliographic cataloging codes, has a long tradition in archi- 
val descriptive practice, and has been officially sanctioned 
at least since the first edition of APPM was published by the 
Library of Congress in 1983." DACS makes this explicit in
12.29A, calling for the addition of the word "family" to the 
family surname. 39 Although this raises the question of how 
DACS-based records would function in a MARC catalog of 
AACR records, library cataloging guidelines also are moving 
in this direction. 

A final challenge to traditional cataloging practice is 
hinted at in DACS's treatment of name headings, a chal- 
lenge that may deserve to be taken up much more broadly. 
Is including detailed and often confusing rules about how 
to form name headings in each cataloging code necessary? 
Could one simply point creators of descriptions directly to 
the (de facto) authority file, and provide abbreviated guid- 
ance about forming headings when catalogers encounter 
names that are not in the authority file? DACS begins the 
process of removing specialist names from its basic content 
standard with the reference to AACR rules to create Islamic
names. 40

Artificial Collections 

Finally, one of the major differences between DACS and 
earlier archival cataloging standards is the elimination of 
the concept of the "artificial collection." "Materials that 
are gathered together by a person, family, or organization 
irrespective of their provenance are intentionally and con- 
sciously assembled for some purpose. Most repositories in 
the U.S. have such collections, and they need to be handled 
and described the same way as materials traditionally con- 
sidered to be 'organic.'" 41 In addition to standardizing the 
way archival collections are described, this development has 
a potentially interesting implication for handling non-archi- 
val material, as well. Recent national efforts to reduce back- 
logs in special collections, for example, have often called for 
greater use of collection-level records for materials such as 
books, maps, or pamphlets. The forthcoming edition of the
new descriptive rules for rare books includes an appendix
on collection-level cataloging, which bridges an uncomfortable
gap between the transcription and non-transcription
approaches. 42 

Areas for Further Exploration 

While DACS and RDA both seem revolutionary in many 
respects, perhaps some of these suggestions have not 
been taken far enough. If simplifying records and
tailoring resource description to both users and the materials
themselves are noble goals, several areas could be further
developed. Although none of these suggestions is novel
or provocative, and authors have proposed many of them in
the literature before, the emergence of new codes provides
another opportunity to raise the questions. It also allows some
context for examining how major changes might be made. 

First among these seems to be abbreviations. Separating 
the content of a bibliographic description from its format 
finally divorces, at least in theory, the description from the 
legacy of the catalog card. Many abbreviations persist
from that legacy. What is the reason, for example,
to insist on abbreviations such as "ca." before dates, when 
other, fuller syntax might make the point much more clearly 
to a universal audience? 

RDA promises to "minimize the need for retrospective 
adjustments when integrating data produced using RDA 
into existing files." 43 This is also the case with DACS, which 
should cause very little conflict between descriptions cre- 
ated using it and APPM, for example. In the major source 
of potential conflict, family names, the Anglo-American 
cataloging community could learn from the specialists in 
archives. For example, even if RDA does not adopt the user- 
friendly recommendations on abbreviations, records will be 
no more difficult to interpret than those records created 
using pre-AACR rules and punctuation conventions that 
exist in our combined catalogs to this day. 

Another major opportunity is to use DACS as a spring- 
board to examine all aspects of archival description, from 
initial processing documentation to final finding aids and 
catalog records. Particularly in those environments where 
these functional tasks are undertaken by different people, 
DACS can provide a common ground for archivists, cata- 
logers, and other personnel to look for efficiencies and 
improvements in the process, an area that some in the pro- 
fession have identified as a pressing need. 44 

The authority work required by both libraries and 
archives might benefit from a more collaborative approach, 
as well. Would maintaining an authorized heading be pos- 
sible in a wiki-like environment, allowing any institution 
to contribute additional information or references as they 
see fit? This is already present in the popular environment, 
where hyperlinks to explanatory materials often point readers
to Wikipedia as an authoritative source. 45 This allows
readers unfamiliar with a topic or concept to be introduced 
to further information without interrupting the narrative 
flow of the text. It also might lead to greater standardization 
simply through forcing the blogger to consider the relation- 
ship between the term as used and the term as "authorized" 
in Wikipedia as the link is constructed. The same principle 
might work well with the kinds of historical or biographical 
contexts provided for names and even subjects in resource 
descriptions. Particularly among specialized communities, 
this decentralized approach might be more beneficial than 
limiting references based on the constraints of our old 
library systems, and would leverage subject expertise where 
needed. 

Another area where such cooperative authority work 
might benefit both users and libraries is in the realm of 
serial title changes. Although DACS proposes no such thing, 
a broad interpretation of the rules for recording administra- 
tive structure, predecessor and successor bodies, and names 
of corporate bodies might allow such context, removed from 
the heading, to serve as an innovative way to handle serial 
title changes. For example, if long narratives of administra- 
tive histories were provided outside the context of resource 
catalogs, including references contributed cooperatively 
for varying names and titles, with a single entry point for 
the serial itself, the function of a serial title name might be
served without the ongoing maintenance required by
current cataloging rules.

The final, and perhaps most challenging, development 
might be to take simplification of creator heading rules 
further. For example, AACR2 currently devotes the bulk
of chapter 22 to the "exceptions" — headings that are not 
commonly encountered in most libraries and archives in the 
English-speaking world. They are even called "Special Rules 
for Names in Certain Languages," a title that acknowledges 
just how obscure these headings are. Entire sections are 
devoted to Indonesian and Malay names, which are so 
complex that even the detail found in these rules cannot 
clarify them for an audience with no knowledge of these 
languages. Since catalogers working with large collections of 
Malay materials are likely to have greater knowledge about 
the formation of these names, as well as reference sources 
not available to average librarians, cataloging codes could be 
simplified and shortened tremendously by removing these 
rules entirely and pointing people who need to formulate 
these headings to another source. 

This would have several benefits. The code itself would 
be shorter and underlying principles would be more appar- 
ent, leading to better-developed cataloger judgment. The 
perception of complexity that is often seen as a reason not 
to create descriptions using AACR-type rules might be miti- 
gated. Finally, the disconnect between subject expert usage 
and cataloger usage that has plagued library history (most
recently with the romanization of Chinese characters) possibly
could be avoided.

Conclusion 

DACS has foreshadowed RDA in transforming description 
of cultural heritage materials for an Anglo-American world. 
Many of its innovations, such as separating content from 
carrier and content from context, are being incorporated 
in the revision of library standards. Others, such as reduc- 
ing or eliminating the use of abbreviations, may be more 
controversial in the larger library community. Nonetheless, 
catalogers not familiar with archives would do well to think 
about how archival materials mirror in many ways the types 
of materials they increasingly are being expected to orga- 
nize for retrieval. The parallels are not exact, but they are 
informative. 

The impact of DACS at this time is limited to the 
archival community in the United States, since it is an 
SAA standard. Just as harmonization between AACR and
non-English-language standards has been difficult to
achieve due to differing descriptive traditions, the efforts to
address standards for archives across the world will prove
just as frustratingly complex. Unlike the MARC environment,
where catalogers are largely dependent on bibliographic 
utilities, archivists retain a high degree of control over their 
own descriptive records, making compliance difficult, if not 
impossible, to ensure. DACS attempts to address this prob- 
lem through flexibility, but that same flexibility may lead to 
a high degree of non-standardization, even when archivists 
and catalogers are attempting to follow its guidelines. The 
legacy of archival description residing in other systems, such 
as paper finding aids, card files, or even databases, must be 
addressed. 

This leads to one last question that must be asked 
about the future of all descriptive standards in the cultural 
heritage community: why should other communities care? 
Certainly the profession has been successful at standard- 
izing bibliographic description of books and serials to a high 
degree, even across the English-speaking world. Other types 
of materials have remained segregated within systems that 
seem to work for them. Even communities such as museums,
which often share libraries' emphasis on standardized
vocabulary for descriptive fields (such as terms from
the Art & Architecture Thesaurus), may not see a need to
adopt more library-like practices for their entire descrip- 
tive framework, despite the best intentions of the drafters 
of RDA. We must ask ourselves what we are offering these 
other communities before attempting to create a standard 
that we hope they may want to use. 

Any effort to revise descriptive standards must balance 
the historical value and proven results of our rules with the
promise of the future. DACS succeeds in doing this for
archival materials, while still retaining a refreshing simplic- 
ity and brevity. We might hope descriptive standards for 
library materials could achieve the same.

Mapping WorldCat's Digital Landscape

By Brian F. Lavoie, Lynn Silipigni Connaway, 
and Edward T. O'Neill 

Digital materials are reshaping library collections and, by extension, tradi- 
tional library practice for collecting, organizing, and preserving information. 
This paper uses OCLC's WorldCat bibliographic database as a data source for 
examining questions relating to digital materials in library collections, including 
criteria for identifying digital materials algorithmically in MARC21 records; 
the quantity, types, characteristics, and holdings patterns of digital materials 
cataloged in WorldCat; and trends in WorldCat cataloging activity for digital 
materials over time. Issues pertaining to cataloging practice for digital materials 
and perspectives on digital holdings at the work level also are discussed. Analysis 
of the aggregate collection represented by the combined digital holdings in 
WorldCat affords a high-level perspective on historical patterns, suggests future 
trends, and supplies useful intelligence with which to inform decision making in 
a variety of areas.

Introduction

Print books have been the traditional focus of library collections; indeed, the 
word library itself originates from the Latin word for book, liber. Over time, 
library collections have diversified to embrace a variety of information resources, 
such as scholarly journals, photographs, microfilm, and videotapes (the authors 
note that a Columbus-area public library even circulates artwork to its users). But 
after print books, one may argue that digital materials have made the greatest 
impact on the nature and shape of library collections. The reverberations of this 
impact are still being felt and the long-term consequences for traditional print 
book collections are yet to be determined.

Digital materials are shifting long-settled library practice for collecting,
organizing, and preserving information. Libraries have been challenged with the 
need to collect and manage new types of materials (for example, software and 
Web sites), as well as new forms of traditional materials (for example, electronic 
books and electronic journals). The established custodial role of libraries has 
been overturned by the growth in digital content obtained through license or 
subscription rather than direct acquisition. Simultaneously, companies such as 
Amazon and Google are making inroads into traditional library services all along 
the discovery-to-delivery chain. Information seeking increasingly occurs in a 
variety of digital environments, with the ensuing need to adapt traditional library 
roles and services to meet the emerging needs and expectations of the "e-user" 
(for example, through the provision of online virtual reference services). 

The impact of digital technologies goes well beyond new forms of material 
in library collections. Even so, the rapid proliferation of digital content — infor- 
mation represented as ones and zeros instead of ink on paper — is the epicenter 
from which ancillary effects ripple out to other library spheres. Any systematic 
analysis of how digital technologies have transformed libraries would find a useful
starting point in examining how digital technologies have
transformed library collections.

This paper uses the OCLC Online Computer Library 
Center, Inc. WorldCat bibliographic database to examine 
questions relating to the growth of digital materials in library 
collections, including criteria for identifying digital materials
algorithmically in MARC21 records; the quantity, types,
characteristics, and holdings patterns of digital materials 
cataloged in WorldCat; and trends in WorldCat cataloging 
activity for digital materials over time. Issues pertaining to 
cataloging practice for digital materials and perspectives on 
digital holdings at the work level are also discussed. The 
purpose is to obtain a general understanding of the process 
by which digital materials have filtered into library col- 
lections over time, and to characterize the types of digital 
materials libraries have included in their collections. 

Taken together, the digital materials cataloged in 
WorldCat represent an aggregate collection, that is, the com- 
bined holdings of multiple institutions, viewed as a single 
unit. In the context of WorldCat, an aggregate collection can 
encompass the holdings of thousands of libraries. Analysis of 
aggregate collections affords a high-level perspective on his- 
torical patterns, suggests future trends, and supplies useful 
intelligence with which to inform decision making in a vari- 
ety of areas. Lavoie, Connaway, and Dempsey use aggregate 
collection analysis to examine the scope and implications of 
the Google Print for Libraries (now Google Book Search) 
project. 1 Lavoie and Schonfeld use similar techniques to 
examine the systemwide print book collection. 2 The present 
study also centers around an aggregate collection, in the 
form of the combined digital holdings in WorldCat. Analysis 
of this "aggregate digital collection" provides insight into the 
digital materials represented in WorldCat, trends in catalog- 
ing activity for digital materials, and reliable bibliographic 
criteria for automated identification of digital materials in 
library catalogs. This study is the first to consider digital
library holdings from the perspective of an aggregate col- 
lection and is intended to provide a preliminary mapping of 
WorldCat's digital landscape. 

Rationale for the Study 

Several considerations motivated this study. First, estab- 
lishing reliable criteria for identifying and characterizing 
digital materials in MARC21-based catalogs is of growing
importance for libraries. Valuable data on digital holdings 
can be extracted from libraries' local integrated library sys- 
tems (ILS), as well as union catalogs like WorldCat. Reliable 
bibliographic criteria are needed to ensure that these data 
can be extracted using automated methods, are consistent 
in their interpretation, and can be meaningfully compared 
across collections. WorldCat is a good resource for obtaining
these criteria, in that it represents a large pool of cataloging
"evidence" that transcends local variations in cataloging
rules and practice, and from which a robust, consistent set 
of criteria can be identified. This paper suggests a set of 
bibliographic criteria useful for broadly characterizing the 
materials in digital collections. 

A second consideration follows from the first. The ability
to extract useful data from local or union catalogs creates
opportunities to support decision making in a variety
of areas. Digital collections are expanding in size, scope, 
and complexity. Effective management of these collections 
requires the gathering and analysis of data to inform decision
making. For example, a library may wish to have detailed 
information about its digital holdings in order to characterize 
the prevailing balance across various dimensions of the col- 
lection (material type, format, online access, and so on), and 
identify areas of need to guide future acquisitions. Analysis 
of local digital collections is important, but libraries can often 
benefit from a wider perspective. For example, a library con- 
sidering an investment in a digitization program may want to 
know what other libraries have already digitized, in order to 
avoid duplicative effort. Similarly, a library making an initial 
investment in digital collection development may want to 
know what types of digital materials have been collected by 
other libraries, perhaps as a means of identifying a core set of 
essential resources. This paper uses the digital materials cata- 
loged in WorldCat to illustrate some ways to analyze digital 
collections, either at the local or aggregate level. 

Advances in computing capacity, both in terms of pro- 
cessing power and storage, have made large-scale data min- 
ing feasible and economical for libraries. Results from data 
mining can be used to inform planning, allocate funding 
and staff, and facilitate cross-institutional collaboration. The
authors hope this paper will encourage libraries to think about new
ways to utilize the bibliographic data in local systems and 
union catalogs to support digital collection management. 

A Note on Data Sources 

The analysis reported in this paper is based on a July 2005 
copy of WorldCat, containing 58,004,317 bibliographic 
records with 990,238,973 holdings. WorldCat is the world's 
largest bibliographic database, representing the combined 
holdings of more than 20,000 libraries worldwide. As such, it 
is a data source that supplies a uniquely broad perspective on 
digital materials in library collections. Using WorldCat limits 
the analysis to the digital materials that libraries have chosen 
to catalog in WorldCat. Unfortunately, no reliable estimate 
of the proportion of digital materials cataloged exists, let 
alone those that are included in WorldCat. Nevertheless, 
the fact remains that WorldCat is the most comprehensive
single data source for conducting an analysis of this kind.

Criteria for Identifying Digital Materials

The first step in mapping out WorldCat's digital landscape 
is to establish borders around the territory of interest — in 
other words, to determine how many digital materials are 
cataloged in WorldCat. This requires a set of bibliographic 
criteria for identifying digital materials, based on informa- 
tion available in a MARC 21 record. 3 

This requirement is complicated by the fact that digital 
format can be indicated in multiple ways in a MARC record; 
moreover, cataloging practice for digital materials has been, 
and remains, in a state of flux. Weiss traces the evolution of 
cataloging practice for digital materials and notes: 

what has happened repeatedly with computer-based 
materials — a set of rules is issued and immediately 
superseded because of new developments in technology.
Another set of rules is issued to address the
shortfall. Catalogers are required to utilize multiple 
and sometimes conflicting cataloging standards in 
order to describe computer-based materials. 4 

Examination of MARC guidelines reveals a number 
of criteria, used either singly or in combination, that could
potentially identify a record that describes a digital resource. 
These include: 

• Type of Record = computer file (byte 6 of the leader
equal to "m")

• Form of Item = electronic (byte 23 or byte 29 of the
008 field equal to "s")

• General Material Designation (GMD) = electronic
resource (subfield $h of the 245 field equal to "electronic
resource." Older GMDs for digital materials
include "machine readable data file" and "computer
file." These have been updated in WorldCat to reflect
the current "electronic resource.")

• Additional Materials/Form of Material = computer
file/electronic resource (byte 0 of 006 field equal to
"m")

• Physical Description = electronic resource (byte 0 of
007 field equal to "c")

• Electronic Location and Access (2nd indicator of 856
field equal to 0 and there is no subfield $3)

• Reproduction Note = electronic reproduction (subfield
$a of 533 field equal to "electronic reproduction")

The first three criteria (Type of Record, Form of Item,
and General Material Designation) are reliable indicators 
that the record describes a digital resource. The other four 
criteria are less reliable. Information in the 006 and 007 
fields can be problematic for automatic (that is, machine-based)
identification of digital materials, because these
fields are repeatable and can apply either to the item
described in the record, or to accompanying or related
material. No prescribed ordering for repeated 006s or 007s
helps resolve this issue. The 856 field is frequently miscoded.
For example, instances of the 856 field, with second
indicator equal to zero and no subfield $3 and therefore 
ostensibly the network location of the resource described 
in the record, are sometimes incorrectly used to supply the 
Uniform Resource Locator (URL) of a Web site related to 
the item. Finally, the 533 field is problematic because the 
relevant information ("electronic reproduction" in subfield 
$a), while commonly used, is not mandatory and therefore 
may not appear. Another point to note about the 533 is that
the record in which it appears describes the original, not the 
reproduction itself. This criterion was included, however, 
for two reasons: (1) the 533 describes a complete resource
in its own right, and (2) if the digital reproduction was not
cataloged separately, the description in the 533 may be the
only record of this material. 

Other combinations of bibliographic data probably exist 
that could be used to identify digital materials, but these 
combinations are unlikely to yield anything more than a 
negligible number of additional records. The criteria speci- 
fied previously should be sufficient to identify virtually all 
WorldCat records describing digital materials. 

A computer algorithm was developed that identifies all 
records in WorldCat satisfying one or more of the aforemen- 
tioned seven criteria. The algorithm was used to scan the 
July 2005 copy of WorldCat. The scan identified 1,015,072 
records satisfying at least one of the three reliable criteria 
(Type of Record, Form of Item, GMD). 

A second scan was done on the remaining records using 
the four less reliable criteria (Additional Materials/Form 
of Material, Physical Description, Electronic Location and 
Access, Reproduction Note). This yielded an additional 
169,437 records, a 17 percent increase over the previous 
total. Not all of these additional records actually describe 
digital materials, for the reasons mentioned previously. 
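
The two-pass scan described above can be sketched in a few lines of Python. This is only an illustration, not the algorithm actually run against WorldCat: the `Record` and `Field` classes are hypothetical, simplified stand-ins for parsed MARC 21 data, the criterion names are invented labels, and repeated subfields are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """Hypothetical, simplified stand-in for one MARC field."""
    tag: str
    data: str = ""                                  # control-field content (006/007/008)
    ind2: str = " "                                 # second indicator
    subfields: dict = field(default_factory=dict)   # subfield code -> value


@dataclass
class Record:
    """Hypothetical, simplified stand-in for one MARC 21 record."""
    leader: str
    fields: list = field(default_factory=list)


def _byte(text, pos):
    """Return the character at position pos, or '' if the string is too short."""
    return text[pos] if len(text) > pos else ""


def reliable_hits(rec):
    """Names of the three reliable criteria that the record satisfies."""
    hits = set()
    if _byte(rec.leader, 6) == "m":
        hits.add("type_of_record")
    for f in rec.fields:
        # Simplification: bytes 23 and 29 are both tested, whatever the material type.
        if f.tag == "008" and "s" in (_byte(f.data, 23), _byte(f.data, 29)):
            hits.add("form_of_item")
        if f.tag == "245" and "electronic resource" in f.subfields.get("h", "").lower():
            hits.add("gmd")
    return hits


def less_reliable_hits(rec):
    """Names of the four less reliable criteria that the record satisfies."""
    hits = set()
    for f in rec.fields:
        if f.tag == "006" and _byte(f.data, 0) == "m":
            hits.add("additional_form")
        if f.tag == "007" and _byte(f.data, 0) == "c":
            hits.add("physical_description")
        if f.tag == "856" and f.ind2 == "0" and "3" not in f.subfields:
            hits.add("electronic_location")
        if f.tag == "533" and "electronic reproduction" in f.subfields.get("a", "").lower():
            hits.add("reproduction_note")
    return hits


def classify(rec):
    """First pass: reliable criteria. Second pass: less reliable criteria only."""
    if reliable_hits(rec):
        return "reliable"
    if less_reliable_hits(rec):
        return "less reliable"
    return "not identified"
```

Keeping the two groups of criteria in separate functions mirrors the two scans: a record reaches the second test only if it failed the first, so the two counts never overlap.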

Identification of digital materials in WorldCat requires 
a balancing of two sometimes competing factors: precision
(minimizing the number of non-digital items falsely identi- 
fied as digital) and recall (maximizing the number of digital 
materials identified). If precision is the overriding concern, 
limiting the extraction parameters to the three reliable cri- 
teria is the best strategy; if the chief objective is recall, use 
of all seven criteria is preferable, even though this inevitably 
will result in a number of false matches. Since the number 
of additional records brought in by the four less reliable 
criteria is small (at most a 17 percent increase, in reality 
probably much less), the analysis reported in this paper is 
confined to the 1,015,072 records in WorldCat matching the 
Type of Record, Form of Item, or GMD criteria (or two or 
all three criteria) for digital materials. 
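
The trade-off can be made concrete with the two record counts reported above. The sketch below recomputes the 17 percent figure and adds a worst-case precision bound for the seven-criteria set. The bound is not from the paper. It rests on the assumption, made here only for illustration, that every record matched by a reliable criterion really is digital and every additional record is a false match.

```python
RELIABLE = 1_015_072     # records matching at least one reliable criterion
ADDITIONAL = 169_437     # further records matched only by the less reliable criteria


def percent_increase(base, extra):
    """Size of `extra` relative to `base`, as a percentage."""
    return 100 * extra / base


def worst_case_precision(trusted_hits, doubtful_hits):
    """Precision if every doubtful hit were false and every trusted hit were correct."""
    return trusted_hits / (trusted_hits + doubtful_hits)


increase = percent_increase(RELIABLE, ADDITIONAL)
floor = worst_case_precision(RELIABLE, ADDITIONAL)
print(f"increase from the second scan: {increase:.1f}%")
print(f"worst-case precision with all seven criteria: {floor:.3f}")
```

Under that assumption, even the most pessimistic reading leaves roughly six in seven of the combined matches correct, which is consistent with the authors' remark that the real increase in digital records is probably well below 17 percent.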

Two further points should be noted in regard to the 
records analyzed in this study. Audio compact discs (CDs),
such as music albums, and digital versatile discs (DVDs),
such as movies, are forms of digital material. Standard 
cataloging practice for audio CDs seems to be to designate 
Type of Record = i or j (non-musical/musical sound record- 
ing), with a GMD of "sound recording"; the term "digital" 
is indicated in subfield $b of the 300 field (physical description:
other physical details). Criteria for DVDs can also be
identified. Including the CD and DVD criteria as indicators 
that the record describes digital material would, thus, be 
logical. Despite this, the researchers decided to exclude 
audio CDs and DVDs from the analysis. These materials 
constitute an important component of library collections in 
their own right; as such, they are a distinct class of materials 
and warrant separate study. 

Finally, analysis of the digital materials in WorldCat 
revealed that, in several instances, sets of books were 
represented in the database only at the collection level.
For example, four digital collections are each treated as a
continuous resource and represented in WorldCat by a single
record: Eighteenth Century Collections Online (~150,000 titles),
Early English Books Online (~100,000 titles), Early American
Imprints: Series I, 1639-1800 (~36,000 titles), and PsycBooks
(~850 titles, ~13,800 chapters).
Another extensive digital collection, Gutenberg-e, is rep- 
resented in WorldCat by a record describing the collection 
as a whole, as well as several additional records describing 
some of the individual titles. Collection-level cataloging 
implies that simple record counts will understate the num- 
ber of digital materials actually represented in WorldCat. 
Efforts are currently underway to extend the granularity of 
e-resource collection descriptions in WorldCat. In order to 
identify library electronic resource holdings in WorldCat at 
the item level, OCLC has integrated the Openly Informatics 
database. It not only provides metadata for resources in 
digital format, including books, serials, audiobooks, theses, 
and dissertations, but also identifies and updates libraries' 
digital resource holdings. This ensures that libraries' digital 
resource holdings are current and accurate, enabling authenticated
end users to access full-text online content through
direct links to content aggregators through WorldCat. 

In sum, the more than one million WorldCat records 
identified using the Type of Record, Form of Item, or 
GMD digital criteria do not perfectly reflect all digital 
materials held by libraries. Therefore, a key point that 
should be emphasized is that the analysis that follows can 
be interpreted as nothing more than a characterization of 
the digital materials cataloged in WorldCat, and not as a 
characterization of all digital materials held in library collec- 
tions. Nevertheless, digital materials cataloged in WorldCat 
provide a broad sample of library digital collection decisions 
and cataloging practices over more than three decades. 



The WorldCat Digital Landscape 

As of July 2005, approximately one million digital materi- 
als of all descriptions were cataloged in WorldCat. These 
records constitute about 2 percent of the total records in 
WorldCat. The proportion of WorldCat devoted to digital 
materials is as yet quite small, but indications are that this 
figure is trending upward. Comparison of the July 2005 
totals with those from a year earlier suggests that the num- 
ber of digital materials cataloged in WorldCat is growing 
rapidly. The July 2005 total of more than one million digital 
materials represents a 35 percent increase over the total for 
July 2004 (about 750,000). Over this same period, WorldCat 
as a whole grew by about 9 percent, so the number of 
WorldCat records describing digital materials grew nearly 
four times faster than the database as a whole.
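
The growth comparison can be checked from the figures in this paragraph. The short sketch below uses the exact July 2005 record count given earlier in the paper, the rounded July 2004 total of about 750,000, and the reported 9 percent growth for WorldCat as a whole. Because two of the three inputs are rounded, the result is approximate.

```python
def percent_growth(earlier, later):
    """Percentage change from `earlier` to `later`."""
    return 100 * (later - earlier) / earlier


digital_growth = percent_growth(750_000, 1_015_072)   # July 2004 total is a rounded figure
worldcat_growth = 9.0                                  # percent, as reported in the text
ratio = digital_growth / worldcat_growth

print(f"digital records grew by about {digital_growth:.0f}%")
print(f"that is about {ratio:.1f} times the growth of WorldCat as a whole")
```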

Returning to the figures for July 2005, the one million 
WorldCat records describing digital materials had a total 
of 30,773,412 holdings attached to them. These holdings 
account for approximately 3 percent of all WorldCat hold- 
ings. On average, then, a WorldCat record describing a 
digital resource has about 30 holdings attached to it. This is 
misleading, however, because the distribution is skewed and 
only about 14 percent of these records actually have 30 or 
more holdings attached. The median number of holdings for 
a WorldCat record describing a digital resource is only one. 
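
A small worked example shows why the average misleads here. The first calculation reproduces the reported mean from the paper's totals. The list that follows is invented toy data, not drawn from WorldCat, chosen only to show how a few widely held records can produce a mean near 30 alongside a median of one.

```python
import statistics

DIGITAL_RECORDS = 1_015_072
DIGITAL_HOLDINGS = 30_773_412

reported_mean = DIGITAL_HOLDINGS / DIGITAL_RECORDS
print(f"mean holdings per digital record: {reported_mean:.1f}")

# Hypothetical holdings counts for ten records (toy data for illustration only).
toy_holdings = [1, 1, 1, 1, 1, 1, 1, 2, 4, 290]

toy_mean = statistics.mean(toy_holdings)
toy_median = statistics.median(toy_holdings)
share_at_or_above_30 = sum(h >= 30 for h in toy_holdings) / len(toy_holdings)

print(f"toy mean: {toy_mean:.1f}, toy median: {toy_median}")
print(f"share of toy records with 30 or more holdings: {share_at_or_above_30:.0%}")
```

In the toy list a single record accounts for nearly all the holdings, so the mean describes almost none of the records. The median is the better summary of a typical record.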

The top ten most widely held digital resources in 
WorldCat as of July 2005 were: 

1. Bipolar Disorders: A Guide to Helping Children &
Adolescents (M. Waltz): 1,340 holdings 

2. The Dictionary of Space Technology (J. Angelo): 1,328 
holdings 

3. Eating Disorders: A Reference Sourcebook (R. Lemburg 
and L. Cohn): 1,284 holdings 

4. The Mafia Encyclopedia (C. Sifakis): 1,272 holdings 

5. A Dictionary of Zoology (M. Allaby): 1,266 holdings 

6. The Greenspan Effect: Words That Move the World's 
Markets (D. Sicilia and J. Cruikshank): 1,264 holdings 

7. US v. Microsoft (J. Brinkley and S. Lohr): 1,261 hold- 
ings 

8. The Internet Edge: Social, Legal, and Technological 
Challenges for a Networked World (M. Stefik): 1,261 
holdings 

9. African-American Art (S. Patton): 1,260 holdings 

10. Ace Your Midterms and Finals: Principles of Economics 
(A. Axelrod): 1,259 holdings 

All of these titles are e-books offered through OCLC's 
NetLibrary service. This result is not surprising, because the 
NetLibrary e-book service has been integrated into libraries' 
WorldCat cataloging workflow; for example, libraries that 
build NetLibrary e-book collections have their holdings set 
automatically in WorldCat. 

The top ten most widely held digital resources in 
WorldCat, excluding NetLibrary e-books, are: 

1. Where to Write for Vital Records (National Center for 
Health Statistics): 1,112 holdings (Web site) 

2. Alzheimer's Disease: Methods and Protocols (N. 
Hooper): 647 holdings (e-book) 

3. Statistical Abstract of the United States (US gov't): 625 
holdings (CD-ROM) 

4. County Business Patterns (US gov't): 589 holdings 
(CD-ROM) 

5. The National Trade Data Bank (US gov't): 585 holdings 
(CD-ROM) 

6. The Budget of the United States Government (US 
gov't): 560 holdings (CD-ROM) 

7. Faith in Every Footstep, 1847-1997: 150 Years of 
Mormon Pioneers (Church of Jesus Christ of Latter- 
day Saints): 555 holdings (CD-ROM) 

8. USA Counties (US gov't): 541 holdings (CD-ROM) 

9. Crime in the United States (US gov't): 534 holdings 
(CD-ROM) 

10. REIS: Regional Economic Information System (US 
gov't): 502 holdings (CD-ROM) 

This list suggests two things: first, widely held digital items 
(apart from NetLibrary e-books) are primarily government 
publications; and second, these publications are stored on 
physical media, namely CD-ROMs. 

In general, holdings of digital materials were widely dis- 
persed. Table 1 reports the holdings distribution for all digi- 
tal materials identified in the July 2005 copy of WorldCat. 

Nearly 60 percent of the digital materials cataloged 
in WorldCat have only a single holding attached. In com- 
parison, an analysis of print books cataloged in WorldCat 
as of January 2005 indicates that 37 percent were uniquely 
held. In other words, nearly double the proportion of digital 
materials are uniquely held compared to print books. 

Interpretation of this result is difficult with the data 
available. It could reflect a general dissimilarity across 
digital collections (evidenced by only a small proportion of 
digital materials being widely held). It could also reflect a 
lack of convergence across libraries in regard to cataloging or 
attaching holdings to digital resources. The most likely 
scenario, however, involves some combination of both factors. 

Table 1. Holdings pattern for digital materials 

Number of Holdings    % of Digital Materials    Cumulative (%) 
1                     59                        59 
2-10                  23                        82 
11-100                8                         90 
>100                  6                         96* 

*About 4 percent of the records describing digital materials had no 
holdings attached. 

Online versus Offline 

One key advantage of the digital format is that materials can 
be accessed over a network from geographically dispersed 
locations. The ability to access material remotely from the 
desktop is increasingly becoming an expectation among 
library users. Knowing how many of the digital materials 
cataloged in WorldCat are available online is therefore 
important. 

In principle, online materials can be identified by the 
presence of an 856 field, with a second indicator of zero and 
no subfield 3. A second indicator of zero indicates that the 
URL given in the field pertains to the material described in 
the record; the absence of a subfield 3 implies that the entire 
item is available online rather than just a portion of it. 
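The 856 test described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the record layout (a list of dictionaries with `tag`, `ind1`, `ind2`, and `subfields` keys), the function name `is_online`, and the example URLs are all hypothetical and chosen only to keep the example self-contained.

```python
def is_online(fields):
    """Return True if any 856 field has second indicator '0' and no subfield 3.

    `fields` is a hypothetical, simplified record layout: a list of dicts
    with keys 'tag', 'ind1', 'ind2', and 'subfields' (code -> value).
    """
    for field in fields:
        if field["tag"] != "856":
            continue
        # Second indicator 0: the URL is for the resource itself.
        # No subfield 3: the URL covers the whole item, not a portion.
        if field["ind2"] == "0" and "3" not in field["subfields"]:
            return True
    return False


# Illustrative records (invented for this sketch).
whole_item = [{"tag": "856", "ind1": "4", "ind2": "0",
               "subfields": {"u": "http://example.org/item"}}]
portion_only = [{"tag": "856", "ind1": "4", "ind2": "0",
                 "subfields": {"3": "Table of contents",
                               "u": "http://example.org/toc"}}]
related_page = [{"tag": "856", "ind1": "4", "ind2": "2",
                 "subfields": {"u": "http://example.org/login"}}]

print(is_online(whole_item), is_online(portion_only), is_online(related_page))
```

Only the first record passes. The other two are the kind of record that a strict application of the criteria would count as offline.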

Running these criteria against the more than one mil- 
lion digital materials cataloged in WorldCat indicates that 
almost half are available online, but this number is likely 
a low-end estimate. An inspection of the records failing 
the 856 field criteria (that is, records representing digital 
resources that are ostensibly offline) reveals that the situa- 
tion is more nuanced than a straightforward application of 
the 856 field criteria would suggest. 

A random sample of 100 records was drawn from the 
collection of offline records. Analysis of the records reveals 
they can be grouped into three broad categories. Forty per- 
cent of the sample were records describing resources that 
were clearly offline (for example, software or data stored on 
CD-ROM or other physical containers). 

A slightly larger proportion, 44 percent, consisted of records 
describing resources that appeared to be available online, 
but for one reason or another failed the 856 field criteria. 
Some 856 fields in these records supplied URLs that did 
not point to the resource itself (and therefore the second 
indicator was not zero); for example, digital content avail- 
able through license or subscription, where the URL in 
the 856 field points not to the resource itself, but to some 
form of mediation page where the user can log in to obtain 
access or ordering information. In other cases, the URL 
pointed to the resource, but the second indicator was left 
blank (no information). Some cases show what appear to be 
non-standard uses of the second indicator or subfield 3 even 
when the URL does in fact point to the resource in ques- 
tion. Another example is where the record indicates that the 
resource is available through the Web (usually in the 533 
field), but no 856 field, and therefore no URL, is provided. 

The remainder (14 percent) are records where it was 
not clear from the information available whether or not the 
resource described was available online. Examples include 
resources where the 856 field points to an ordering page, 
publication information, or even the publisher's home page, 
but whether the content could be accessed online is not 
clear. 

Extrapolating these results to all records failing the 
standard online criteria suggests that anywhere from 73 to 
80 percent of the digital materials in WorldCat are actually 
available online, compared to the approximately 50 percent 
indicated by matching the standard 856 field criteria. Only 
about two-thirds of these online materials can be reliably 
identified using machine processing. Adoption of cataloging 
practices that permitted a reliable distinction between online 
and offline digital materials, obtained through machine pro- 
cessing of the record rather than human inspection, would 
be beneficial in organizing and presenting search results in 
library catalogs. 
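The extrapolation can be reproduced approximately with the rounded figures given in the text. The sketch below assumes that exactly 50 percent of records pass the 856 criteria and applies the sample proportions (44 percent apparently online, 14 percent unclear) to the remainder; the function name and parameter names are illustrative. With these rounded inputs the range comes out at roughly 72 to 79 percent, close to but not identical with the reported 73 to 80 percent, presumably because the published figures are themselves rounded.

```python
def online_share(passing, appears_online, unclear):
    """Estimate the share of all records that are actually online.

    passing:        share of records meeting the 856 criteria
    appears_online: share of the failing records that appear to be online
    unclear:        share of the failing records whose status is unclear
    Returns (low, high): the low end counts only the 'appears online'
    records, the high end also counts the unclear ones.
    """
    failing = 1.0 - passing
    low = passing + failing * appears_online
    high = passing + failing * (appears_online + unclear)
    return low, high


# Rounded inputs taken from the text; 0.50 is an approximation.
low, high = online_share(passing=0.50, appears_online=0.44, unclear=0.14)
print(round(low, 2), round(high, 2))

# Share of the online materials that the 856 criteria alone would find.
print(round(0.50 / high, 2), round(0.50 / low, 2))
```

The last line gives values between about 0.63 and 0.69, consistent with the statement that only about two-thirds of online materials can be identified by machine processing.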



Cataloging Activity 

The earliest confirmed record in WorldCat describing a digi- 
tal resource (that is, the one with the lowest OCLC number) 
is record #1617882, created on September 11, 1975, by the 
American Antiquarian Society and entered into WorldCat 
later that year. The record describes a data file, recorded 
on a single tape reel, containing 1860 and 1880 U.S. census 
data on residents of Worcester, Massachusetts. 

Since that time, more than one million additional 
records for digital resources have been added to WorldCat. 
Only in the last few years has the flow of records describ- 
ing digital resources been significant. Table 2 shows the 
number of records describing digital materials entered into 
WorldCat for each year between 1975 (the year the first 
digital record was entered) and 2005. 

Several years exhibit significant jumps compared to the 
previous year, for example, 1984 (833 records) compared 
to 1983 (133 records); and 1985 (5,204 records) compared 
to 1984 (833 records). Only in 1992 does a steady accelera- 
tion become evident; the yearly total increased from 5,750 
records in 1992 to 31,020 records by 1999. In 2000, catalog- 
ing of digital materials in WorldCat spiked, rising to 166,961 
records. From this point onward, the annual total of digital 
materials cataloged in WorldCat has never fallen below 
110,000, suggesting that the dramatic increase witnessed in 
2000 was the catalyst for a sustained movement to higher 
levels of cataloging activity for digital materials. 

The majority of digital materials cataloged in WorldCat 
as of July 2005 were entered in the last few years. Eighty-five 
percent of these records were entered in 2000 or later, that 
is, in the previous five and a half years. Only about 1 percent 
were entered prior to 1986. This suggests that cataloging of 
digital materials in WorldCat is a fairly recent phenomenon, 
confined for the most part to the last half-decade, even 
though the second edition of Anglo-American Cataloging Rules 
(AACR2) incorporated rules for cataloging digital materials 
more than twenty-five years ago in 1978, and the era of 
personal computing dates from roughly the same time, with the 
introduction of the Apple II in 1977 and the IBM PC in 1981. 5 

Table 2. Distribution of records by year entered in WorldCat, 
1975-2005 

Another interesting 
characteristic of WorldCat 
cataloging activity for digital 
materials is the proportion 
of records originating from 
the Library of Congress 
compared to the propor- 
tion contributed by the 
OCLC membership. Using 
the presence of "DLC" in 
the 040 subfields $a and 
$c to identify a Library of 
Congress record (that is, 
the record was both cre- 
ated and transcribed by 
the Library of Congress), 
analysis revealed that 
16,826 records describing 
digital materials, or about 
2 percent, were created by 
the Library of Congress. 
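The "DLC" test can be sketched as follows. The representation of the 040 field as a list of (subfield code, value) pairs and the function name `is_lc_record` are assumptions made for this illustration; the study's own tooling is not described in the text. The final lines recompute the reported share from the article's figures.

```python
def is_lc_record(subfields_040):
    """Return True if 'DLC' appears in both $a and $c of the 040 field.

    `subfields_040` is a hypothetical layout: a list of (code, value)
    pairs taken from a record's 040 (Cataloging Source) field.
    """
    values_a = [value.strip() for code, value in subfields_040 if code == "a"]
    values_c = [value.strip() for code, value in subfields_040 if code == "c"]
    return "DLC" in values_a and "DLC" in values_c


# Illustrative 040 fields (invented for this sketch).
lc_record = [("a", "DLC"), ("c", "DLC"), ("d", "OCoLC")]
member_record = [("a", "WAU"), ("c", "WAU")]
print(is_lc_record(lc_record), is_lc_record(member_record))

# Share of digital records identified as Library of Congress records,
# using the counts reported in the article.
lc_share = 16_826 / 1_015_072 * 100
print(round(lc_share, 1))
```

The computed share is a little under 2 percent, which the text rounds to "about 2 percent."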

In comparison, about 11 percent of WorldCat as a whole 
consists of Library of Congress records, suggesting that 
WorldCat records describing digital materials are much 
more likely to be contributed records than the average 
WorldCat record. Further work is needed to understand 
the implications of this finding, but one can surmise that the 
disparity reflects the fact that many digital materials do not 
yet fit the pattern of the types of materials usually cataloged 
by Library of Congress. It might also provide some explana- 
tion for the wide variance in cataloging practice for digital 
materials, since contributed records will reflect the practices 
and policies of a variety of institutional contexts. 

Types of Materials 

Cataloging rules for digital materials have undergone a shift 
in focus from emphasizing the form of the item (that is, its 
digital format) to emphasizing its content, or material type. 
Weiss discusses this point in her paper. 6 To some degree, 
this shift has been necessitated by the rapidly expanding 
range of materials available in digital form, which has in turn 
been reflected in libraries' digital collections. The shift has 
led to a need for increasingly granular descriptions of digital 
materials; in other words, segregating a library's digital hold- 
ings as a single, monolithic portion of the collection is not 
sufficient. Table 3 provides a breakdown of the WorldCat 
records describing digital materials according to the MARC 
Bibliographic Level categories. 

Monographs clearly account for the vast majority of 
digital materials (85 percent). The only other categories of 
significance are serials (9 percent) and monographic compo- 
nent parts (5 percent). Monographic materials encompass a 
fairly wide range of information resources, however, so it is 
helpful to consider a different view of the digital materials in 
WorldCat, based on the MARC Type of Record categories. 
This distribution is provided in table 4. 

Nearly three-quarters of the digital materials in 
WorldCat are some form of language material. Again, this is 
a fairly wide-ranging category. A further breakdown of the 
digital language materials according to some well-known 
material types, shown in table 5, provides still more insight 
into the types of digital materials held in library collections. 
Books in digital form constitute the largest proportion of 
digital language materials. Government documents also 
claim a significant proportion, as do e-journals. 

Tracking the change in the mix of digital material types 
over the years is interesting. Table 6 shows the distribu- 
tion of records across Type of Record categories for three 
periods: 1985 and earlier, 1986 through 1995, and 1996 
and later. The results in table 6 indicate a profound shift in 
the types of digital materials held by libraries. Virtually all 
digital materials cataloged in WorldCat in 1985 or earlier 
(99 percent) were described as "computer files." In contrast, 
more than three-quarters of the digital materials cataloged 
in WorldCat in 1996 or later were designated as "language 
materials," with only 18 percent designated as "computer 
files." The other major point revealed by these data is the 
significant expansion in the range of materials falling into the 
digital category. Digital materials cataloged during or before 
1985 were predominantly in two categories: computer files 
and language materials. Only two other categories (project- 
ed medium and kit) were represented. Between 1986 and 
1995, the range of material types showing up in WorldCat 
widened appreciably. Computer files and language materials 
were still the only categories with significant representation, 
but seven additional material types were also represented. 



Table 3. Distribution of records by MARC bibliographic level 

Bibliographic Level           Number     % 
Monograph                     863,620    85 
Serial                        90,624     9 
Monographic component part    49,551     5 
Subunit                       8,655      1 
Serial component part         1,568      <1 
Collection                    1,054      <1 
Integrating resource 

Table 4. Distribution of records by MARC type of record 

Type of Record                                  Number     % 
Language material                               726,299    72 
Computer file                                   234,691    23 
Two-dimensional non-projected medium            22,870     2 
Cartographic material                           14,786     1 
Manuscript language material                    4,735      <1 
Non-musical sound recording                     3,978      <1 
Musical sound recording                         3,917      <1 
Projected medium                                1,986      <1 
Notated music                                   1,515      <1 
Kit                                             120        <1 
Mixed material                                  115        <1 
Manuscript cartographic material                31         <1 
Manuscript notated music                        23         <1 
Three-dimensional artifact or natural object    6          <1 

Table 5. Types of digital language materials 

Material Type                             Number     % 
Monographic language materials (books)    472,680    65 
Government documents*                     114,185    16 
Language-based serials (journals)         67,861     9 
Theses/dissertations*                     28,911     4 
Other                                     42,662     6 

*Government documents were identified on the basis of information in the 
008 field, while theses and dissertations were identified on the basis of the 
existence of the 502 field. 

Between 1996 and 2005, five material types (language 
materials, computer files, two-dimensional non-projected 
medium, cartographic material, and manuscript language 
materials) displayed significant representation, while nine 
other categories were also represented. 

At least part of the difference exhibited across time in 
the range of digital materials reflects changes in cataloging 
practice for digital materials rather than changes in the types 
of digital materials cataloged and entered into WorldCat. As 
noted previously, early cataloging rules for digital materials 
tended to emphasize form over content; in other words, the 
most significant property of digital materials was the fact 
that they were digital. As cataloging rules evolved, form 
was de-emphasized in favor of material type and subject 
area. Knowing that a resource was a computer file was not 
enough; the fact that it was an e-book or e-journal was also 
important. In light of this, at least part of the expansion 
over time in the range of digital material types is likely the 
result of changes in methods of bibliographic description, 
suggesting that the relatively narrow range of material types 
identified in early years (pre-1985) may mask a wider variety 
of materials lumped together under the single category of 
"computer file." 

Other factors leading to the observed differences over 
time in the range of digital material types in WorldCat are 
changing collection development policies and an expand- 
ing diversity in the types of digital materials available for 
acquisition. For example, libraries currently likely have a 
lower propensity to acquire and catalog "shrink-wrapped 
software" (that is, computer files) and a greater propensity 
to acquire online content, such as e-books and e-journals, 
than in the past. Moreover, many forms of online content 
were simply not widely available until the mid- to late 1990s. 
Further work is needed to analyze trends in the types of 
digital materials available for acquisition, as well as changes 
in collection development policies for digital materials. 

Table 6. Distribution of records by type of record and period 

Type of Record                    1985 and earlier (%)   1986-1995* (%)   1996 and later (%)   All years (%) 
Language material                 1                      4                77                   72 
Computer file                     99                     96               18                   23 
2-dim. non-projected medium                              <1               2                    2 
Cartographic material                                    <1               2                    1 
Manuscript language material                             <1               1                    <1 
Non-musical sound recording                                               <1                   <1 
Musical sound recording                                  <1               <1                   <1 
Projected medium                  <1                     <1               <1                   <1 
Notated music                                                             <1                   <1 
Kit                               <1                     <1               <1                   <1 
Mixed material                                           1                <1                   <1 
Manuscript cartographic material                                          <1                   <1 
Manuscript notated music                                                  <1                   <1 
3-dim. artifact/natural object                                            <1                   <1 

*Percentages do not add up to 100 due to rounding. 

"Digital Works" 

A great deal of recent work has focused on aggregating, 
managing, and displaying bibliographic data at multiple levels 
of granularity. Work in this area is underpinned by the 
Functional Requirements for Bibliographic Records (FRBR) 
model, a framework for articulating the relationships between 
bibliographic entities, including works, expressions, 
manifestations, and items. FRBR defines a work as "a distinct 
intellectual or artistic creation." 7 Thus, Macbeth is a work. 
A manifestation, on the other hand, is a physical embodiment 
of an expression of a work. Thus, the Folger Shakespeare 
Library edition of Macbeth, published in paperback by 
Washington Square Press in 2004, is a manifestation of the 
work Macbeth. A single work can have multiple manifestations 
associated with it. 

WorldCat records describe manifestations. The finding 
that there are more than one million digital materials 
cataloged in WorldCat is equivalent to saying that more than one 
million digital manifestations are cataloged in WorldCat. 
This in turn invites the question of how many distinct works 
are represented by these digital manifestations. To answer 
this question, the FRBR work set algorithm developed by 
OCLC Research was used to cluster the more than one million 
WorldCat records describing digital materials into their 
associated works. The OCLC Research work set algorithm 
converts MARC21 bibliographic databases into FRBR work 
sets, where a work set is a cluster of all records (that is, 
manifestations) pertaining to the same work. 8 

The 1,015,072 digital manifestations in WorldCat can 
be rolled up into 921,095 distinct works. As of July 2005, 
46,155,940 distinct works were represented in WorldCat as 
a whole, so only about 2 percent of the works in WorldCat 
contain at least one digital manifestation. This is a 
remarkably small number and suggests that there is tremendous 
scope for mass digitization programs. 

On average, a "digital work" in WorldCat (that is, a work 
containing at least one digital manifestation) will include 
1.1 digital manifestations, a result not significantly 
different from 1. In comparison, the average work in WorldCat, 
taking into account all formats, contains approximately 1.3 
manifestations. In practice, works can vary considerably in 
the number of manifestations associated with them. Table 
7 shows the distribution in the size of "digital works." The 
results in table 7 indicate that 667,124 (nearly three-quarters) 
of the 921,095 works containing at least one digital 
manifestation are single manifestation works. In other 
words, the work consists of one manifestation, which is a 
digital object. This would suggest that most "digital works" 
in WorldCat (that is, works with at least one digital manifes- 
tation) are, in fact, works that are "born-digital" (that is, have 
no antecedents in the print world). This hypothesis must be 
advanced with some caution; other non-digital manifesta- 
tions may exist for these works, but have simply not been 
cataloged in WorldCat. 
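The idea of rolling manifestation records up into work sets can be illustrated with a toy example. The sketch below is emphatically not the OCLC Research work set algorithm, whose details are not given in this article; it simply groups records on a normalized author and title key, which is an assumption made here for illustration. The record layout, the function names, and the sample records are likewise hypothetical. The closing lines recompute two ratios from the counts reported above.

```python
import re
from collections import defaultdict


def work_key(author, title):
    """Build a crude grouping key from author and title (illustrative only)."""
    def normalize(text):
        # Lowercase and drop punctuation so trivial variants match.
        return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()
    return (normalize(author), normalize(title))


def cluster_into_works(records):
    """Group manifestation records that share the same work key."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record["author"], record["title"])].append(record["id"])
    return dict(works)


# Invented sample records: two manifestations of one work, plus one other work.
records = [
    {"id": 1, "author": "Shakespeare, William", "title": "Macbeth"},
    {"id": 2, "author": "Shakespeare, William", "title": "Macbeth."},
    {"id": 3, "author": "Allaby, M.", "title": "A Dictionary of Zoology"},
]
works = cluster_into_works(records)
print(len(works))

# Ratios computed from the counts reported in the article.
manifestations_per_work = 1_015_072 / 921_095
digital_work_share = 921_095 / 46_155_940 * 100
print(round(manifestations_per_work, 1), round(digital_work_share, 1))
```

A real work set algorithm has to cope with variant names, translations, and uniform titles, so a simple key of this kind should be read only as a picture of the clustering step, not as a workable substitute.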

To gain more insight into this issue, a random sample 
of 100 single-manifestation "digital works" was chosen for 
manual inspection. These records represent a fairly diverse 
set of materials, including a number of materials that were 
definitely born-digital (for example, Web sites and soft- 
ware) as well as other materials that are likely to have been 
born-digital (for example, government reports, theses, and 
dissertations). Other materials, such as books and serials, 
are more questionable. For these materials, the reason they 
appear as single-manifestation digital works is likely because 
other non-digital manifestations have not been cataloged in 
WorldCat, or were cataloged differently. Scanned images of 
historical artifacts are likely to fall into this category. 

These conclusions are hardly more than speculation. A 
good topic for future research would be to look at the "digi- 
tal works" in WorldCat and try to determine how many are, 
indeed, single manifestation, born-digital works or whether 
other manifestations also exist. This information can be of 
vital importance in a number of library decision-making 
contexts, such as preservation. 

Table 7. Distribution of "Digital Works" by size 
(number of manifestations) 

Work Size (# of Manifestations)    Number     % 
1                                  667,124    72 
2                                  138,322    15 
3                                  56,771     6 
4                                  20,820     2 
5                                  9,639      1 
6-10                               15,559     2 
11-100                             11,155     1 
>100                               1,705      <1 

Conclusion 

The ultimate significance of digital materials in library 
collections is not their growth in number and diversity. Rather, 
it is the opportunities they present for meeting the needs of 
users who increasingly operate in networked digital spaces. 
In this sense, a study of the number, type, and features of 
digital materials in WorldCat — a study solely confined to 
the digital materials themselves — is necessarily incomplete. 
Further work is needed to understand how these digital 
materials can be incorporated into a range of information 
environments and linked to emergent user behaviors. 

As of July 2005, WorldCat contained more than one mil- 
lion records describing digital resources, to which more than 
30 million holdings have been attached. While the number 
of digital materials cataloged in WorldCat is still proportion- 
ately small, it is clearly a growing segment in terms of both 
size and importance, reflecting similar trends in individual 
library collections. These digital materials form the digital 
landscape through which future workflows, services, and 
user interactions must navigate. As digital materials con- 
tinue to proliferate in library collections, this landscape will 
expand and exhibit increasingly complex features; conse- 
quently, libraries will require detailed information about 
their digital holdings to support collection management 
decisions. Being able to isolate digital materials in a collec- 
tion for automated analysis will therefore be important, but 
these materials cannot be viewed monolithically. Analysis 
must proceed on a more granular level, as libraries will wish 
to know not only the size of their digital collections, but also 
how these collections measure up along multiple dimen- 
sions, such as material type (for example, books, e-journals, 
and software) and mode of access (for example, online ver- 
sus offline). 

As libraries look for innovative, efficient ways to man- 
age their digital holdings, some analysis may be directed at 
the level of the aggregate collection — that is, the combined 
holdings of multiple institutions. Analysis of aggregate digi- 
tal collections (where aggregation can occur on a consortial, 
regional, national, or even international basis) facilitates 
direct collaboration between libraries in a variety of areas, 
such as mass digitization or cooperative collection develop- 
ment. It also allows individual libraries to make decisions 
placed against a larger context, which in turn helps foster 
convergence in areas where this is important, and avoid 
duplication in others. 

Because WorldCat represents the aggregate holdings 
of thousands of libraries, it offers a unique perspective on 
the incorporation of digital materials into library collections. 
It also points to some limitations concerning legacy biblio- 
graphic data for digital materials. Because digital materials 
have been subject to a particularly fluid evolution of 
cataloging practice and acquisition methods, repurposing legacy 
bibliographic data to meet the new uses emerging from 
networked digital environments for research and learning 
becomes correspondingly more difficult. Stabilization of 
cataloging rules for digital materials would help greatly in this 
regard. In addition, new practices need to be adopted for 
cataloging the output of mass digitization programs. Success 
in both of these areas will facilitate automated scanning and 
processing of bibliographic databases, which in turn will 
support views of the information contained within that are 
tailored to the needs of "e-learners" and "e-researchers." 

Notes on Operations 

Application Profile Development 
for Consortial Digital Libraries 

An OhioLINK Case Study 

By Emily A. Hicks, Jody Perkins, 
and Margaret Beecher Maurer 

In 2002, OhioLINK's consortium of libraries recognized the need to restructure 
and standardize the metadata used in the OhioLINK Digital Media Center as 
a step in the development of a general purpose digital object repository. The 
authors explore the concept of digital object repositories and mechanisms used to 
develop complex data structures in a cooperative environment, report the findings 
and recommendations of the OhioLINK Database Management and Standards 
Committee (DMSC) Metadata Task Force, and identify lessons learned, address- 
ing data structures as well as data content standards. A significant result of the 
work was the creation of the OhioLINK Digital Media Center (DMC) Metadata 
Application Profile and the implementation of a core set of metadata elements and 
Dublin Core Metadata Element Set mappings for use in OhioLINK digital proj- 
ects. The profile and core set of metadata elements are described. 

Introduction 

Digital repositories have evolved 
from relatively simple collections 
of digital objects with individual meta- 
data schemas to complex online envi- 
ronments needing reliable and flexible 
metadata structures to accommodate 
differing demands, platforms, and 
services. One example of this trend, 
the OhioLINK Digital Media Center 
(DMC), developed out of a statewide 
collaborative environment and contin- 
ues to be redefined to meet the needs 
of cooperating libraries. 1 OhioLINK, 
the Ohio Library and Information 
Network, is a consortium of eighty- 
five college and university libraries 
and the State Library of Ohio. The 
goal of OhioLINK is to provide easy 
access to information and swift deliv- 
ery of materials throughout the state. 
OhioLINK services include a cen- 
tral online catalog, shared electronic 
resources, an electronic theses and dis- 
sertations center, and an environment 
for digital project development and 
access. 



By 2002, five years after the DMC 
was established, the need to restruc- 
ture and standardize the metadata was 
clear to OhioLINK staff and member 
libraries. The DMC provides access to 
a variety of digital media assets includ- 
ing image, sound, and video files from 
OhioLINK institutions, other partner 
organizations, and commercial ven- 
dors. A series of subject-specific databases 
had been created, each with 
bases had been created, each wiuh 
a separate, discipline-appropriate 
metadata scheme. Little attempt had 
been made to standardize information 
across the databases and searching was 
limited to one database at a time. 

OhioLINK's Database Manage- 
ment and Standards Committee 
(DMSC), composed of technical ser- 
vices representatives from OhioLINK 
member institutions, appointed the 
OhioLINK Database Management 
and Standards Committee (DMSC) 
Metadata Task Force in spring 2003. 
The Task Force was charged with 
providing direction to the DMSC and 
OhioLINK on the development of the 
DMC, surveying current and emerging 
metadata standards, and drafting 
guidelines for the use of metadata in 
the DMC. 

The primary result of the Task 
Force's work is the OhioLINK 
DMC Metadata Application Profile. 2 
Complex environmental and histori- 
cal factors and the great diversity of 
needs within the OhioLINK envi- 
ronment informed the application 
profile creation process. This paper 
describes the mechanisms used to 
foster the evolution of data structures 
in a cooperative environment and 
discusses specific decisions and find- 
ings that resulted in the creation of 
an application profile, including the 
identification of a core set of meta- 
data elements. The paper presents the 
Task Force's findings, lessons learned, 
and recommendations, addressing 
data structures as well as data content 
standards. Finally, the paper describes 
the current status of the DMC as 
well as plans to incorporate the DMC 
into the OhioLINK Digital Resource 
Commons (DRC). 3 

A Review of the Literature 

Identifying Appropriate Metadata 

Several studies have shown that qual- 
ity metadata is an important compo- 
nent of digital collections. In their 
article about the challenges of meta- 
data in university digital libraries, 
Attig, Copeland, and Pelikan assert 
that successful digital libraries must 
have a "robust metadata structure 
that can accommodate and preserve a 
variety of discipline-specific metadata 
while supporting consistent access 
across collections." 4 In a 2004 study 
of Australian digital collections, Hider 
finds that respondents think using 
already established standards when 
describing digital collections is very 
important. 5 Bruce and Hillmann point 
out that while the library community is 
comfortable with attempting to quan- 
tify and measure quality, as evidenced 
by the acceptance of the BIBCO core 



record, this acceptance must take 
place at the community level, and that 
"most metadata communities outside 
of libraries are not yet at the point 
where they have begun to define, 
much less measure, quality." 6 Dushay 
and Hillmann adapt a commercially 
available visual graphical analysis tool 
to evaluate metadata, with the aim of 
developing a tool for efficiently analyz- 
ing large databases of metadata. 7 

Broad agreement on what consti- 
tutes good metadata, or even appro- 
priate metadata, is difficult. Scalability 
and relevance have been identified by 
Intner, Lazinger, and Weihs as features 
of good metadata as well as "adequate 
description of the kinds of data ele- 
ments for which the library's users 
search." 8 This last factor can vary wide- 
ly within any consortium's commu- 
nity. Researchers also have found that 
designing elaborately perfect metadata 
schemas may not help provide access 
in the absence of good data. Attig, 
Copeland, and Pelikan write that they 
"were forced to ask how little meta- 
data would be required for discovery" 
and that "this question is particularly 
important for image data." 9 

According to the National 
Information Standards Organization 
(NISO) framework, good metadata is 
appropriate for the materials in the 
collection, users of the collection, and 
intended current and likely use of the 
digital object; supports interoperabil- 
ity; uses standard controlled vocabu- 
laries to reflect the what, where, when, 
and who of the content; includes a 
clear statement on the conditions and 
terms of use for the digital object; is 
authoritative and verifiable; and sup- 
ports the long-term management of 
objects in collections. 10 

Specific guidelines, such as the 
Computer Interchange of Museum 
Information's (CIMI) "Guide to 
Best Practice: Dublin Core" and the 
Collaborative Digitization Program's 
(CDP) "Dublin Core Metadata Best 
Practices," provide a more detailed 
account of implementing the metadata 
component of digital projects. 11 These 
guidelines typically include element- 
level guidance on semantics (how to 
interpret an element), syntax (how 
to format the data that populates an 
element), and recommended value 
domains (what controlled vocabular- 
ies, coding schemes, etc. are valid for a 
given element). The CIMI document 
guides the implementation of Dublin 
Core (DC) in a museum environment, 
presenting element level guidelines 
for all of the fifteen elements in DC 
Simple. 12 

Information environments also 
can heavily affect metadata imple- 
mentation. Providing access to digital 
libraries differs significantly from pro- 
viding access to traditional libraries. 
Intner, Lazinger, and Weihs note that 
the very fact that the items being 
described are online is the "most 
important and obvious difference." 13 
The authors go on to say that: 

Digital libraries are likely to 
be very large, quickly grow- 
ing, frequently changing data- 
bases; they are likely to be 
collaborative efforts; they are 
likely to include more diverse 
types of materials; and their 
users do very little searching 
while they are at the digital 
library's home institution, if 
it has only one. As a result, 
asking a librarian how to 
find something one believes 
should be in the database but 
does not show up in answer 
to a search query may not be 
an option. . . . Without stan- 
dard methods for describing 
database documents and their 
contents, maintaining author- 
ity control, and so on, access 
to the documents suffers. 14 

Baca concurs in her article about 
applying metadata schemas and con- 
trolled vocabularies, stating that, for 
cultural heritage institutions, a meta- 
data standard "appropriate to 
the materials in hand and the intended 
end-users must be selected." 15 

In an article titled "Developing a 
Metadata Strategy," Agnew details the 
steps involved in building a metadata 
repository, including "modeling the 
information needs of your commu- 
nity, selecting and adapting a metadata 
standard, documenting your metadata, 
populating the database and sharing 
your metadata with other repositories 
and metadata initiatives." 16 OhioLINK 
institutions emphasize the importance 
of that consortial community. Bauer 
and Carlin explain that the DMC is 
specifically designed to eliminate bar- 
riers to institutional participation and 
they encourage OhioLINK institutions 
to focus on "content creation, acquisi- 
tion and development, thus promoting 
the true nature of an academic collab- 
orative venture." 17 The impact of this 
perspective on the quality of the DMC 
legacy data will be discussed later in 
this paper. 

Cooperative communities have 
historically struggled to reconcile 
their independent metadata systems, 
comprised of legacy data, even in 
the MARC environment where stan- 
dards are far more secure. Bruce 
and Hillmann comment that "legacy 
data presents special problems for 
many communities, as it rarely makes 
a clean transition into new metada- 
ta formats." 18 Bishoff and Meagher 
find no compelling reason to require 
institutions with legacy data to create 
new records since "economic real- 
ity requires this level of flexibility." 19 
Cromwell-Kessler points out that the 
retrospective conversion of already 
existing legacy data is "expensive and 
time-consuming. Where no single 
standard exists, integration will entail 
'translating' from one structured data 
system to another." 20 

Bishoff and Meagher perceive the 
challenge for collaborative projects to 
be the integration of separate collec- 
tions "using a common set of metadata 
standards while retaining the unique 
character of each collection." 21 A 2004 
Australian study of digital collections 
found that almost all of the institu- 
tions surveyed valued standardized 
metadata and federated search func- 
tionality and that most were work- 
ing toward interoperability. 22 Chopey 
reasons, "Because metadata for digital 
collections is not likely to be stored for 
use by any institution except the one 
creating and maintaining it, the driv- 
ing force behind the development of 
metadata standards for digital collec- 
tions in the future is most likely to be a 
desire for uniform access methodology 
across collections." 23 Intner, Lazinger, 
and Weihs state, "Given the choice 
between a perfect but unique metada- 
ta schema utterly lacking in interoper- 
ability and a moderately good schema 
that gets high marks for interoper- 
ability, most experts recommend the 
latter . . . [because] in a collaborative 
environment interoperability trumps 
perfection every time." 24 

If interoperability is the key, how 
is it attained? Much has been writ- 
ten on the process of cross-walking 
or data mapping between metadata 
systems, as well as on the integration 
of disparate metadata systems within 
a single database. Cromwell-Kessler 
says that the process entails "difficult 
decisions about how to handle com- 
plex data issues." 25 Baca writes about 
the importance of the selection of 
appropriate metadata schemas and the 
role of metadata mapping and cross- 
walks. 26 Bishoff and Meagher discuss 
how a collaborative project developed 
a matrix to look at common elements 
across metadata standards. 27 
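
As a rough illustration of what a crosswalk involves, the sketch below maps a record from a hypothetical local schema onto unqualified Dublin Core element names. The local field names, the mapping table, and the sample record are assumptions made for this example; they are not drawn from any of the projects cited above.

```python
from collections import defaultdict

# Hypothetical crosswalk: local field name -> unqualified Dublin Core element.
CROSSWALK = {
    "caption": "title",
    "photographer": "creator",
    "topic": "subject",
    "date_taken": "date",
    "file_type": "format",
}


def crosswalk_record(local_record, mapping=CROSSWALK):
    """Map a local record to DC; return (dc_record, unmapped_fields)."""
    dc_record = defaultdict(list)  # treat every DC element as repeatable
    unmapped = {}
    for field, value in local_record.items():
        target = mapping.get(field)
        if target is None:
            # No equivalent element: keep the field for a human decision.
            unmapped[field] = value
            continue
        values = value if isinstance(value, list) else [value]
        dc_record[target].extend(values)
    return dict(dc_record), unmapped


# Invented sample record, for illustration only.
record = {
    "caption": "Glider in flight",
    "photographer": "Unknown",
    "topic": ["Aeronautics", "Gliders"],
    "negative_number": "N-0042",
}
dc, leftover = crosswalk_record(record)
print(dc)
print(leftover)
```

Fields without a target element are returned rather than silently dropped, because deciding how to handle them is exactly the kind of judgment the literature describes. The sketch is purely structural: it moves values between element names and does nothing to reconcile differences in what those values mean.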

The Collaborative Digitization 
Program (CDP), formerly known as 
the Colorado Digitization Project, 
experienced many of these issues. 28 
As early as 2000, Allen described the 
collaborations inherent in the project 
and the results, noting the great need 
for good communications and plan- 
ning within the collaborative environ- 
ment, stating "[t]he risks relate to 
quality of the digital objects, digital 
preservation, and quality of metadata, 
and these risks must be ameliorated 
through extensive education and train- 
ing." 29 The program focuses on the 
importance of learning through doing, 
and recognizes that there are unique 
challenges in cooperative projects. 30 
According to Intner, Lazinger, and 
Weihs, the CDP is currently in the 
middle of its second strategic plan and 
doing well. 31 

Attig, Copeland, and Pelikan 
study the deployment of three sepa- 
rate metadata schema within a single 
database by creating a merged super- 
set of all the elements in the three 
standards. 32 Although this exercise 
proves to be relatively uncomplicated, 
it does not ensure true interoper- 
ability. According to Attig, Copeland, 
and Pelikan, "The main difficulties 
concern the meaning of the values 
contained in the elements. . . . They 
may arise out of contextual differences 
in the use of language in different 
disciplines or differences in the role 
that the data element itself plays in 
imparting meaning to the values (the 
hierarchical context). Regardless of 
the source of the differences, mapping 
is about meaning." 33 

Baca advocates the use of struc- 
tured vocabularies and thesauri for 
populating metadata schemas "to 
increase both precision and recall in 
end-user retrieval." 34 Metadata cre- 
ated by the contributors can be cre- 
ated more quickly and earlier in the 
information life cycle for rapidly grow- 
ing digital collections; the process of 
metadata creation can more actively 
involve the contributors in collection 
development; and the contributors, 
as experts, can provide more accu- 
rate and granular access points. 35 
Unfortunately, according to both 
Chopey and Weibel, this rosy future 
has not been realized. 36 Weibel calls 
the prospect of self-archived meta- 
data seductive. 37 Attig, Copeland, 
and Pelikan contend that, in order 
to accommodate contributor-created 
metadata, the requirements for data 
entry must be kept modest at best. 38 

Few traditional library catalogers 
have experience outside the MARC 
and Anglo-American Cataloging Rules 
paradigm. Data content standards for 
cultural objects were only recently for- 
malized with the 2006 publication of 
Cataloging Cultural Objects: A Guide 
to Describing Cultural Works and 
Their Images. 39 Bishoff and Meagher 
note that one of the major challenges 
of the CDP is the lack of catalog- 
ing expertise, which they consider "a 
problem for all types and sizes of 
institutions, not just the small libraries 
and historical societies." 40 They find 
that few catalogers participating in the 
program have experience analyzing 
and describing digital objects. Chopey 
observes that the level of granular- 
ity within digital collections is often 
higher than in library catalogs. 41 

Caplan's Metadata Fundamentals 
for All Librarians provides an excellent 
introduction to a variety of metadata 
schema and serves as a springboard 
for analysis of available metadata stan- 
dards. 42 Caplan lays out the principles 
and practices that underlie most stan- 
dards and then applies these standards 
through critical descriptions of various 
families of metadata schemas. One 
of the metadata schemas that Caplan 
describes is Dublin Core (DC). This 
set of metadata elements was one of 
the products of an invitational metada- 
ta workshop held in Dublin, Ohio, at 
OCLC, the Online Computer Library 
Center, in March 1995. 43 The Dublin 
Core Metadata Initiative's (DCMI) 
element set has been selected for a 
multitude of metadata projects, pri- 
marily because it supports data map- 
ping and sharing, is Open Archives 
Initiative (OAI) compliant, and is 
designed for simplicity of use. 44 

Hider found that most responding 
libraries used some level of imple- 
mentation of DC in a 2004 Australian 
study of digital collections. 45 DC is the 
metadata element set of choice for 
the CDP to assure interoperability, 
although some elements were modi- 
fied to facilitate the use of DC with 
digital surrogates of primary source 
materials. 46 The CDP developed a set 
of DC-based best practices that pro- 
vides one example of how to structure 
an application profile to describe a 
wide variety of resources in a complex 
consortial environment. 47 

In a 2004 study of the usage levels 
of unqualified DC metadata elements 
in Open Archives Initiative Protocol 
for Metadata Harvesting (OAI-PMH) 
data providers, Ward found that only 
five of the fifteen elements are used 
most of the time and that more than 
half of the eighty-two data providers 
use only the creator and identifier 
elements. 48 According to Bruce and 
Hillmann, "Ward's study indicates that 
most metadata providers use only a 
small part of the DC element set, 
but her study makes no attempt to 
determine the reliability or usefulness 
of the information in those few ele- 
ments." 49 A 2001 survey of DC users 
by Guinchard indicated that most 
groups choose DC for its perceived 
international acceptance, the flexibility 
of the DC elements, and the probabil- 
ity of future interoperability with other 
metadata schemes. 50 Critics of the DC 
element set contend that the fifteen 
elements are too simplified, and calls 
for expansion have led to the addition 
of optional qualifiers. Others handle 
the simplicity issue by including non- 
DC metadata in addition to DC ele- 
ments in their projects. In contrast to 
Ward's study, most of those surveyed 
by Guinchard use all fifteen DC ele- 
ments, lending weight to the argument 
that DC provides a solid foundation 
for metadata development. The find- 
ings support the need for usage guide- 
lines, and some survey participants 
even call for the development of a DC 
library application profile. 
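
The kind of element-usage tally reported in these studies can be sketched in a few lines. The example below assumes records are already available as dictionaries keyed by unqualified DC element name; the three sample records are invented, and the function is not a reconstruction of either study's method.

```python
from collections import Counter

# The fifteen elements of unqualified (simple) Dublin Core.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def element_usage(records):
    """Return, per DC element, the share of records with a non-empty value."""
    counts = Counter()
    for record in records:
        for element in DC_ELEMENTS:
            if record.get(element):  # missing or empty values do not count
                counts[element] += 1
    total = len(records)
    if total == 0:
        return {}
    return {element: counts[element] / total for element in DC_ELEMENTS}


# Invented sample records, for illustration only.
records = [
    {"title": ["A"], "creator": ["X"], "identifier": ["1"]},
    {"creator": ["Y"], "identifier": ["2"]},
    {"title": ["C"], "creator": ["Z"], "identifier": ["3"], "subject": []},
]
usage = element_usage(records)
for element, share in usage.items():
    print(f"{element:12s} {share:.0%}")
```

A tally like this shows only whether an element is populated. As Bruce and Hillmann note of Ward's study, it says nothing about the reliability or usefulness of the values themselves.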

Baca concludes that there is no 
"one-size-fits-all metadata scheme" 
and that therefore the first step is 
to select the appropriate metadata 
schema. 51 Cromwell-Kessler notes 
that metadata systems may be com- 
posed of different data elements func- 
tioning at different levels, in different 
ways. 52 Intner, Lazinger, and Weihs 
suggest that metadata schemas change 
because new schema develop that have 
new features, and that standard sche- 
ma are "nearly always preferred over 
customized or proprietary schemas 
that cannot be incorporated easily into 
a multi-institutional, multi-database, 
multi-community environment." 53 
According to Hider's 2004 survey of 
Australian digital information provid- 
ers, the top reasons for choosing a 
metadata format are: 

• most appropriate standard for 
nature of collection; 

• existing standard for non-digital 
collections; 

• community's favored standard; 

• government standard; 

• interoperability; 

• supported by system; 

• existing expertise in the stan- 
dard at the institution; 

• requirement for participation in 
a cross-institution project; and 

• simplicity. 54 

Defining Appropriate Metadata 
Using Application Profiles 

Developing application profiles is an 
important first step in defining appro- 
priate metadata. According to Agnew, 
"Implementing a core or root schema 
implies that one's organization will 
be developing an application profile 
for the schema. . . . Once one has 
determined the data elements to be 
used, the attributes of those data ele- 
ments, the order in which the data 
elements will display . . . and whether 
each element is repeatable, manda- 
tory or optional, it is time to document 
the application profile." 55 The DCMI 
Glossary defines an application profile 
(AP) as: 

a declaration of the metadata 
terms an organization, infor- 
mation resource, application, 
or user community uses in its 
metadata. In a broader sense, 
it includes the set of metadata 
elements, policies, and guide- 
lines defined for a particular 
application or implementa- 
tion. The elements may be 
from one or more element 
sets, thus allowing a given 
application to meet its func- 
tional requirements by using 
metadata elements from sev- 
eral element sets including 
locally defined sets. 56 

Elements can be further refined 
or narrowed, but not changed. An 
application profile is not just a model 
for documentation or for formulat- 
ing guidelines; it also represents an 
approach to metadata that is much 
more flexible and responsive to local 
needs than is possible when simply 
adopting someone else's guidelines. 
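
To make the idea concrete, the sketch below represents an application profile as a small data structure recording, for each element, the element set it comes from, its obligation, and whether it is repeatable, and then checks a record against it. The profile contents, the `collection_name` local element, and the sample record are hypothetical; this is a minimal sketch of the concept, not the format of any published profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRule:
    name: str         # element name as used in records
    source: str       # element set the term is drawn from, e.g. "dc" or "local"
    obligation: str   # "mandatory" or "optional"
    repeatable: bool


# Hypothetical profile mixing DC elements with one locally defined element.
PROFILE = [
    ElementRule("title", "dc", "mandatory", False),
    ElementRule("creator", "dc", "optional", True),
    ElementRule("subject", "dc", "optional", True),
    ElementRule("rights", "dc", "mandatory", False),
    ElementRule("collection_name", "local", "mandatory", False),
]


def validate(record, profile=PROFILE):
    """Return a list of problems; an empty list means the record conforms."""
    rules = {rule.name: rule for rule in profile}
    problems = []
    for rule in profile:
        values = record.get(rule.name, [])
        if rule.obligation == "mandatory" and not values:
            problems.append(f"missing mandatory element: {rule.name}")
        if not rule.repeatable and len(values) > 1:
            problems.append(f"element not repeatable: {rule.name}")
    for name in record:
        if name not in rules:
            problems.append(f"element not in profile: {name}")
    return problems


# Invented sample record; each element maps to a list of values.
record = {
    "title": ["Sample photograph"],
    "creator": ["Unknown"],
    "rights": [],
    "medium": ["photograph"],
}
problems = validate(record)
for problem in problems:
    print(problem)
```

Keeping the rules as data rather than as prose is one way a profile could serve both as documentation for contributors and as input to automated checks.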

Several reasons to use an applica- 
tion profile are presented by Neuroth 
and Koch. 57 An application profile pro- 
vides a standardized way to document 
the important decisions that have been 
made about the elements, including 
content standards and rules for use. 
Such documentation can facilitate 
migration, harvesting, and other auto- 
mated processes. A standard template 
for documentation makes it easier to 
maintain consistency across implemen- 
tations and can assist the development 
of an overall metadata strategy in the 
future. An application profile offers a 
systematic way of developing and shar- 
ing a data model. Because an applica- 
tion profile enables tracking across 
implementations to verify compliance, 
Heery and Patel suggest it "can pro- 
vide a basis for different metadata 
initiatives to work together." 58 

An application profile address- 
es local needs while still retaining 
desired levels of interoperability. 
Dekkers notes that the development 
of an application profile facilitates the 
use of multiple schemas because ele- 
ments can be selected from more than 
one existing schema or locally cre- 
ated and defined. 59 Guidelines unique 
to a given project or community of 
practice can be easily documented 
because, "An application profile is not 
considered complete without docu- 
mentation that defines the policies 
and best practices appropriate to the 
application." 60 Bruce and Hillmann 
assert, however, that application pro- 
files are more useful for specialized 
communities because "[a]pplication 
profiles, which by their nature are 
models created by community consen- 
sus, demand a level of documentation 
of practice that is rarely attempted 
by individual projects or implement- 
ers." 61 An application profile provides 
a framework for a fully developed set 
of guidelines that contributors can 
use as a reference or training guide 
for metadata creators. According to 
Bruce and Hillmann, "Better docu- 
mentation at several levels has long 
been at the top of metadata practi- 
tioners' wish list. The first and most 
general improvement is in the appli- 
cation of standards." 62 Project and col- 
lection level application profiles, once 
archived and made publicly available 
in an application profile repository, 
can be used as resources for search 
terms and other project documenta- 
tion and by prospective contributors 
or other project implementers seek- 
ing information on projects similar to 
their own. 63 

Heery and Clayphan note that 
an application profile, in the form of 
meta-metadata, also addresses issues 
of data preservation. 64 In the same 
manner that technical metadata is 
required for the ongoing preserva- 
tion of digital objects, documentation 
of metadata in the standardized form 
of an application profile is needed 
for the preservation of metadata that 
inevitably will become vulnerable to 
corruption through the many versions 
and migrations that have come to be 
commonplace for digital collections. 

Application profiles can be cre- 
ated at different levels of abstraction, 
ranging from community of practice 
guidelines to project level implemen- 
tations. Three levels are in common 
use: 

• Discipline- or format-based 
communities of practice seek- 
ing to establish a standard set of 
guidelines specific to a certain 
discipline or format. Examples 
include the DCMI, the CanCore 
Learning Resource Metadata 
Initiative, and the Video 
Development Initiative. 65 

• Consortiums or other collab- 
orative groups seeking to estab- 
lish a common set of guidelines 
for their members. Examples 
include the CDP and Canadian 
Culture Online. 66 

• Local project implementers 
needing to document local 
practice, track project specific 
details, and ensure compliance 
with other standards. At this 
level, application profiles are 
often called data dictionaries 
and are somewhat different 
than a full application profile. 
These local level application 
profiles include less detail and 
are more prescriptive since they 
document all the final choic- 
es made for a specific instan- 
tiation. Examples include the 
University of Washington and 
Miami University. 67 

In "Metadata Principles and 
Practicalities," Duval et al. support 
using application profiles to facili- 
tate blending of metadata schemas to 
accommodate the functional require- 
ments of an application while 
maintaining a necessary level of 
interoperability with base schemas. 68 
They note, "Metadata modularity is a 
key organizing principle for environ- 
ments characterized by vastly diverse 
sources of content, styles of content 
management, and approaches to 
resource description." 69 By combin- 
ing established metadata schemas and 
observed best practice, a new appli- 
cation can be developed that meets 
local requirements without sacrificing 
cross-domain interoperability. 

Deploying Appropriate Metadata 
via Institutional Repositories 

In 2002, the Association of Research 
Libraries' Scholarly Publishing 
and Academic Resources Coalition 
(SPARC) released "The Case for 
Institutional Repositories: A SPARC 
Position Paper," which envisioned an 
institutional repository (IR) as a "stra- 
tegic response to systemic problems in 
the existing scholarly journal system." 70 
Lynch defines an IR as a "set of services 
that a university offers to the members 
of its community for the management 
and dissemination of digital materi- 
als created by the institution and its 
community members." 71 Anuradha 
explains, "Institutional repositories 
(IR) are digital collections that cap- 
ture, collect, manage, disseminate, and 
preserve scholarly work created by 
the constituent members in individ- 
ual institutions. They are born out of 
problems with the current scholarly 
communication model developed by 
commercial publishers and vendors." 72 
SPARC characterizes these reposito- 
ries as being institutionally defined, 
scholarly, cumulative and perpetual, 
and open and interoperable. 73 

By studying the growth rates in 
the usage of electronic scholarly infor- 
mation, Odlyzko finds them sufficient- 
ly high to predict that "there will be 
no doubt that print versions will be 
eclipsed. ... To stay relevant, scholars, 
publishers and librarians will have to 
make even larger efforts to make their 
material easily accessible." 74 Allard, 
Mack, and Feltner-Reichert find that 
"the growth in literature demon- 
strates that institutional repositories 
are gaining in momentum through- 
out academia." 75 In a 2005 study of 
IR deployment in thirteen nations, 
Westrienen and Lynch witnessed a 
great diversity in IRs, and predict 
that deployment rates will continue 
to increase. 76 Shearer acknowledges 
that predicting the long-term success of 
the IR model is difficult. 77 Chopey 
notes that successful implementations 
require broad collaborations of exper- 
tise as well as strong guidance from 
collection curators or compilers. 78 In 
addition, Lynch observes that the suc- 
cess of IRs depends on institutions 
recognizing IR as a serious and long- 
lasting commitment. 79 

Work of the Task Force 

The DMSC Task Force's examination 
of appropriate metadata, application 
profiles, and institutional repositories 
revealed challenges for consortial digi- 
tization projects such as integrating 
sometimes disparate collections using 
common metadata standards, choos- 
ing appropriate schemas, and creating 
good quality metadata. The next steps 
were to examine the metadata in the 
DMC, select a base schema, create 
a set of core metadata elements, and 
develop an application profile. The 
remainder of this paper details these 
decisions, providing recommendations, 
lessons learned, and conclusions. 

Metadata in the Digital 
Media Center 

The DMC was established in 1997 
using the Bulldog digital asset man- 
agement software. When the Task 
Force began investigating metadata, 
the DMC contained collections with 
an eclectic assortment of digital media 
files of multidisciplinary interest, each 
with its own unique metadata needs 
and issues. At the time of this writing, 
the DMC contains more than 54,000 
digital images of art and architecture, 
more than 1,500 full-length educa- 
tional videos, and almost 4,000 items 
in six historic and archival collections. 
Contributions come from an array 
of Ohio institutions and arrive in a 
variety of formats including sound 
files, digital video, and various stan- 
dard imaging formats. Commercial 
collections — the Encyclopedia of 
Physics Demonstrations, LANDSAT 7 
Satellite Images of Ohio, Sanborn Fire 
Insurance Maps, Saskia Art History 
Images, and the ART Collection of 
art and archaeology objects — are also 
available through the DMC. Licensing 
agreements for these databases require 
OhioLINK to restrict access to indi- 
viduals associated with an OhioLINK 
member institution. 

Metadata for each collection was 
supplied by the OhioLINK contribu- 
tor, a commercial vendor, or harvested 
by the software. Subject terminologies 
specific to the genre of the collections, 
terms used by subject specialists, 
and terms familiar to patrons desir- 
ing access to particular collections of 
digital media were used. Topical over- 
lap was minimal and the structures 
and specificity of the terminology var- 
ied widely. For example, terms used 
to describe the photographs in the 
Wright Brothers Collection were very 
different from those used to describe 
the videos in the Encyclopedia of 
Physics Demonstrations. 

The Bulldog software allowed 
keyword indexing of selected fields 
within each collection. This indexing 
was augmented by structured index 
fields from commercial media prod- 
ucts or adapted from the indexing sup- 
plied with a project. Descriptive terms 
for subject searches had to be selected 
from a pool of terms supplied with the 
software. The variance in initial meta- 
data and subject terminology resulted 
in the creation of separate databases, 
each with metadata appropriate to a 
specific genre or discipline in addition 
to the more generic terms supplied by 
the software. 

The limitations of the software 
ultimately hindered searching of the 
DMC collections. Content in one col- 
lection could not be searched from 
within another collection, nor could 
users of the repository expect consis- 
tent application of subject terms or 
consistent search results across the 
collections. Though a common subject 
thesaurus for the DMC was available, 
it was not apparent from the user 
interface, nor was the Bulldog thesau- 
rus available to users. By the time the 
Task Force was formed, a company 
called Documentum had acquired the 
Bulldog digital asset management soft- 
ware and was developing software that 
integrated document management, 
Web content management, digital 
asset management, and metadata with 
functionality to facilitate federated 
searching and data harvesting. Any 
new structure would have to address 
the quality, consistency, and compat- 
ibility of the metadata as well as access 
to the collections. After further exami- 
nation of the Documentum system, 
OhioLINK staff decided to look for an 
open source system that could handle 
the varied metadata formats, metadata 
cross-walks, library-specific protocols, 
and higher education standards need- 
ed in today's consortial environment. 

Legacy Data in the Digital 
Media Center 

From the beginning, the data struc- 
tures in the DMC were not apparent or 
consistent because of the nature of the 
information. These metadata were cre- 
ated for collections that were designed 
for different audiences and based on 
various metadata standards. The need 
for a cross-disciplinary core set of ele- 
ments was apparent. Every collection 
had unique fields and a few common 
fields that could be mapped to Dublin 
Core, the Visual Resources Association 
(VRA) Core, and the Collaborative 
Digitization Program Core. 80 Multiple 
types of data structures led to discrep- 
ancies between databases and with 
established standards. For example, 
the ART Collection data did not fol- 
low the standard set by the VRA, and, 
according to the license agreement, 
the data had to be mounted as pro- 
vided. OhioLINK chose to accom- 
modate the needs of a wide variety of 
contributors rather than risk losing the 
projects. 



While all the databases contained 
a small number of similar fields, some 
databases included fields that did 
not apply to other databases. The 
Task Force prepared an analysis of 
metadata in each subject database 
to determine needs, characteristics, 
and problems. Initial efforts involved 
mapping existing DMC metadata and 
metadata from locally held collections 
not yet submitted to the DMC into 
one of several emerging metadata 
standards. The Task Force then com- 
pared the DMC elements to elements 
used by the Collaborative Digitization 
Program and Dublin Core. These 
efforts resulted in "The DMC Core 
Fields Analysis Document." 81 Further 
developments of this spreadsheet 
yielded initial assessments of wheth- 
er or not each metadata element 
appeared to be mandatory, required, 
or optional; whether or not the data 
field was repeatable; and notations of 
any issues that appeared to be associ- 
ated with use of the field. 
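
The comparison behind such an analysis can be sketched as follows, assuming each collection's fields have already been mapped, where possible, to a DC element. The collection names, field names, and mappings below are invented for illustration and do not reproduce the contents of the Task Force's spreadsheet.

```python
from collections import Counter

# Hypothetical field inventories: collection -> {local field: DC element or None}.
COLLECTION_FIELDS = {
    "photographs": {"caption": "title", "photographer": "creator", "negative_no": None},
    "videos": {"program_title": "title", "producer": "creator", "running_time": "format"},
    "maps": {"sheet_title": "title", "scale": None, "city": "coverage"},
}


def core_candidates(collection_fields):
    """Split DC elements into those mapped in every collection and those mapped in only some."""
    coverage = Counter()
    for fields in collection_fields.values():
        # Count each element once per collection, ignoring unmapped fields.
        coverage.update({dc for dc in fields.values() if dc is not None})
    total = len(collection_fields)
    shared = sorted(e for e, n in coverage.items() if n == total)
    partial = sorted(e for e, n in coverage.items() if n < total)
    return shared, partial


shared, partial = core_candidates(COLLECTION_FIELDS)
print("mapped in every collection:", shared)
print("mapped in some collections:", partial)
```

Elements mapped in every collection are natural candidates for a mandatory core, while those mapped in only some collections are the ones for which obligation and repeatability call for case-by-case judgment.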

Selecting a Metadata Standard 

Cross-domain interoperability is a 
common theme throughout digital 
library research. Digital collections 
with different architectures, metadata 
formats, and underlying technologies 
need common protocols and standards 
in order to interact. The Task Force 
agreed that the future of the DMC 
collections and their growth would 
depend on finding and adopting a set 
of metadata standards that would be 
flexible enough to accommodate the 
needs of the individual OhioLINK 
digital collections while facilitating 
federated searching, a challenge in 
part because no one had examined 
the relationships between the DMC 
databases that would facilitate feder- 
ated searching. Though procedures (in 
the format of a proposal form) were in 
place for submitting collections to the 
DMC, enforced standards or docu- 
mentation for establishing new data 
or metadata structures were not avail- 
able to contributors. 82 The Task Force 
anticipated that a core set of metadata 
elements accommodating existing and 
future collections must be developed 
to facilitate potential development and 
federated searching. This core set of 
elements would be anchored in meta- 
data standards and accompanied by 
a best practices document to assist 
data compatibility of future DMC 
collections. 

In the preceding few years, there 
had been an explosion in the growth 
and development of non-MARC 
metadata standards. The Task Force 
considered and rejected a variety of 
standards for adoption in the DMC. 
Some standards, such as Encoded 
Archival Description (EAD) and 
Metadata Object Description Schema 
(MODS) were rejected because they 
were deemed too complicated for 
non-cataloger contributors. 83 The Text 
Encoding Initiative (TEI) standard was 
not considered because of concerns 
with attaching the metadata directly to 
the digital object. 84 Several educational 
standards, including Sharable Content 
Object Reference Model (SCORM), 
Learning Object Metadata (LOM) 
and Metadata for Education Group 
(MEG), were examined and deemed 
too specific for this project. 85 The VRA 
Core Categories also were discussed 
extensively, but were ultimately dis- 
carded as being too cultural object-ori- 
ented to accommodate the data. 86 The 
Task Force ultimately chose an appli- 
cation profile to document the current 
decisions and to provide the needed 
framework for a more fully developed 
set of guidelines in the future. 

Selecting a Base Schema 

The Task Force needed a base schema 
that would accommodate the hetero- 
geneous content of the DMC repre- 
sented by multiple formats, multiple 
subject areas, and multiple contribu- 
tors, and simultaneously support fed- 
erated searching and harvesting. The 
schema also needed to be interoperable
with legacy data and be adaptable
to change over time. Every effort was 
made to choose recognized authorita- 
tive sources in common use by the 
digital library community. After a care- 
ful review of emerging metadata sche- 
mas, best practice documents, and the 
DMC elements currently in use, the 
Task Force selected the DC schema 
as the basis for the core element set 
because it met the requirements of the 
DMC environment. 

The DC element set was devel- 
oped with the goals of interoperability, 
extensibility, and flexibility in mind. 
Interoperability is important for cross- 
domain discovery and harvesting. DC 
provides a high level of interoper- 
ability and thus would support feder- 
ated searching and harvesting. Other 
standards are too narrow to be applied 
across all of the DMC collections. 
The Task Force's work also indicat- 
ed that all manifestations of existing 
DMC metadata, as well as selected 
schemas used in non-DMC collec- 
tions at OhioLINK member institu- 
tions, could be mapped to elements 
in DC. Dublin Core Simple had been 
established as an international stan- 
dard, which increased the possibility 
that it would come into common use. 
DC was also the foundation
of the OAI-PMH. 87

According to Lagoze, 
"The OAI approach to 
metadata harvesting 
exemplifies the notion of 
metadata modularization, 
mandating simple Dublin 
Core metadata for cross- 
community interoperabil- 
ity while supporting, in 
parallel, community-spe- 
cific metadata for 'drill- 
down' searching within 
domains." 88 These trends 
are important because the 
larger the community of 
users for a single standard, 
the greater the opportunity
for resource sharing
through harvesting and 



cross-domain discovery. DC also sup- 
ports the creation of resource descrip- 
tions that are easy to produce and use, 
which is an important consideration for 
contributors without access to training 
or professional catalogers. 
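
To make the harvesting role of simple DC concrete, the following minimal Python sketch builds an OAI-PMH ListRecords request asking a repository for records in the oai_dc format. The base URL, the set name, and the function name are hypothetical placeholders chosen for illustration, not OhioLINK endpoints; only the verb and the metadataPrefix value come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://repository.example.edu/oai"


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL.

    oai_dc is the simple Dublin Core format that OAI-PMH
    repositories are required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional selective harvesting by set
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL, set_spec="aviation"))
```

Because every harvester can rely on the same oai_dc request, a collection described with DC-mappable elements can be gathered alongside collections from other communities without a collection-specific agreement.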

Digital Media Center Core 
Metadata Elements 

The Task Force discussed numer- 
ous fields as possible core elements 
and the implications of including 
and excluding each in the applica- 
tion profile. These discussions were 
often long and sometimes contentious. 
Even though most members worked 
in libraries, a substantial difference of 
views existed regarding metadata and 
what steps should be pursued. In the 
end, the list was narrowed to twenty- 
two core fields including elements 
from DC and supplementary elements 
deemed necessary in the OhioLINK 
environment. Mapping to the DC ele- 
ment and the DC definition has been 
retained for those elements drawn 
directly from the DC element set. Any 
refinements have been made accord- 
ing to DCMI principles. Table 1 is a 
list of the core fields and their rela- 
tionship to the original, the digital 



manifestation, and OhioLINK asset 
management. The Task Force viewed 
these core elements as a starting point 
for institutions interested in creating 
metadata for the collections in the 
DMC. Each institution would have 
the option to use only the core fields 
or to include additional fields beyond 
the core to adequately describe their 
collections. The creation of subject- 
related sets of element extensions and 
additional fields would be possible at 
any time. 

The DMC Core contains six 
mandatory elements — Title, Creator, 
Digital Publisher, Asset Type, Object 
Identifier, and Permissions. Of 
these six elements, two are system- 
supplied — Asset Type and Object 
Identifier — and three are OhioLINK- 
specific — Asset Type, Object Identifier, 
and Permissions. By making Title, 
Creator, and Digital Publisher the 
only other mandatory elements and 
by demonstrating that metadata could 
be as simple or complex as a project 
warranted, the Task Force hoped to 
promote widespread adoption of the 
Core by DMC contributors. 

The Title element, defined as a 
name given to a resource, was the 
most difficult element to finalize. 



Table 1. DMC core elements

Elements related to the     Elements related to the    Elements related to
original (regardless of     digital manifestation      OhioLINK asset
format)                                                management
--------------------------  -------------------------  ------------------------
Title*                      Digital Publisher*         Collection Name
Creator*                    Digital Creation Date      OhioLINK Institution
Contributor                 Digitizing Equipment       Asset Type*
Date                        Asset Source               OID (Object Identifier)*
Description                 Rights                     Permissions*
Subject
Spatial Coverage
Temporal Coverage
Language
Work Type
Repository Name
Repository ID

*Mandatory elements

Source: OhioLINK DMSC Metadata Task Force, "OhioLINK Digital Media Center (DMC) Metadata Application
Profile" (May 11, 2004), http://dmc.ohiolink.edu/docs/DMC_AP.pdf (accessed Aug. 11, 2006).
Title

Definition: A name given to a resource. Typically a title will be a name by which the resource is 
known. It may also be an identifying phrase or object name supplied by the holding institution. 

Obligation: Mandatory 
Occurrence: Non-Repeatable 

Recommended Schemes: none. 



Input Guidelines: 

1. Identify and enter one Title element per record according
to the guidelines that follow. 

2. Transcribe title from the resource itself, such as book title, 
photograph caption, artist's title, object name, etc., using same 
punctuation that appears on the source. 

3. When no title is found on the resource itself, use a title assigned 
by the holding institution or found in reference sources. If title 
must be created, make the title as descriptive as possible, avoiding 
generic terms such as Papers or Annual report. Use punctuation 
appropriate for English writing. 

4. When possible, exclude initial articles from title. Exceptions 
might include when the article is an essential part of the title or 
when local practice requires use of initial articles. 

5. Capitalize only the first letter of the first word of the title and 
of any proper names contained within the title. 

6. Consult established cataloging rules such as Anglo-American 
Cataloguing Rules (AACR2) or Archives, Personal Papers, and 
Manuscripts (APPM) for more information. 



Examples: 

1. Channel crew poling ice blocks

2. DH4 battle plan and Wright Model C Flyer share air space

3. Exhibition flight over Lake Erie

4. Great Ballcourt



Maps to DC Element: Title 



Figure 1. Title element



Although the Task Force agreed that 
Title should be mandatory, the occur- 
rence was revised more than once. The 
Task Force disagreed about whether
Title should be repeatable and whether
alternate titles should be included in
the core elements. If alternate titles 
were included, should the alternate 
title be part of the Title element, thus 
requiring Title to be repeatable, or 
a separate element? If alternate title 
was a separate element, should it be a 
core field? All of these decisions had 
to be in place before the input guide- 
lines could be finished and the Title 
element finalized. The Task Force 
eventually decided to make the Title 
element non-repeatable and to include 
any other titles in the additional fields. 
Additional fields are non-core fields 
needed for a specific project and are 
beyond the scope of the application 



profile document. Figure 1 shows the 
Title element. 

The second mandatory element 
is Creator, which includes authors, 
artists, photographers, collectors, or 
organizations primarily responsible for 
producing the content of the resource. 
Entities with a secondary role in the 
creation process, such as editors, illustrators,
and performers, are included in
the optional Contributor element. Both 
Creator and Contributor are repeat- 
able fields. Project implementers are 
instructed to enter names according to 
established rules (for example, Anglo- 
American Cataloguing Rules, 2nd 
ed. (AACR2), and Archives, Personal 
Papers, and Manuscripts) or use 
the guidelines outlined in the DMC 
Metadata Application Profile. 89 The 
General Input Guidelines state that 
the same rules or guidelines should be 
used for names throughout the project 



profile. The recommended scheme 
for both elements is the Library of 
Congress Authorities file. 90 

The Date element contains the 
creation or modification date or 
dates of the original resource. Date
is required (if available) and repeatable.
A resource may have several
dates associated with the original 
resource such as creation date, copy- 
right date, revision date, and modi- 
fication date. The Digital Creation 
Date element records the date of 
creation or availability of the digital 
resource and may be approximated by 
the agency of creation. This element 
is required (if available) and non-repeatable.
Date maps to DC.date
while Digital Creation Date maps to
DC.date.available, a refinement of
DC.date. The recommended scheme
for both elements is ISO 8601, the 
International Standard for the repre- 
sentation of dates and times. 91 
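
As an illustration of what a syntax encoding scheme asks of a contributor, the following Python sketch checks whether a value for Date or Digital Creation Date follows the date-only forms of ISO 8601 (YYYY, YYYY-MM, or YYYY-MM-DD). The function name is hypothetical, and restricting the check to these three forms is a simplifying assumption; the application profile itself may permit other ISO 8601 representations.

```python
import re
from datetime import date

# Date-only subset of ISO 8601: YYYY, YYYY-MM, or YYYY-MM-DD.
_ISO_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def is_iso8601_date(value):
    """Return True if value is a valid YYYY, YYYY-MM, or YYYY-MM-DD date."""
    match = _ISO_DATE.fullmatch(value)
    if match is None:
        return False
    year, month, day = match.groups()
    try:
        # A missing month or day defaults to 1 so partial dates can be checked.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


for candidate in ["1911", "2004-05-11", "May 11, 2004", "2004-02-30"]:
    print(candidate, is_iso8601_date(candidate))
```

A check of this kind catches free-text dates at entry time, which is when they are cheapest to correct.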

The Description element is an 
account of the content of the resource 
and may include an abstract, table 
of contents, provenance, or other 
descriptive text. The Description ele- 
ment holds specialized information 
that is not included in other elements. 
Description is required (if available) 
and repeatable. The Subject element, 
or topic of the content of the resource, 
is required (if available) and repeat- 
able. The application profile strongly 
recommends selecting a value from, 
or creating values according to, a con- 
trolled vocabulary, name authority 
file, or formal classification scheme 
to ensure consistency, reduce spell- 
ing errors, and improve the quality of 
search results. Examples include the 
Library of Congress Subject Headings 
(LCSH), Medical Subject Headings 
(MeSH), and the Thesaurus for 
Graphic Materials I: Subject Terms. 92 

Spatial Coverage describes the 
location or locations covered by the 
intellectual content of the resource, 
not the place of publication. Examples 
include place names, longitude, and 
latitude. Recommended schemes for 
Spatial Coverage include the Getty
Thesaurus of Geographic Names, 
DCMI Box, DCMI Point, ISO 3166, 
and LCSH. 93 Temporal Coverage 
refers to the time period covered by 
the intellectual content of the resource, 
not the date of publication or digi- 
tal creation date. The recommended 
schemes for Temporal Coverage are 
ISO 8601 and LCSH. Both coverage-related
elements are optional, repeatable,
and map to DC.Coverage, which
includes refinements for spatial and 
temporal coverage. 

The Language element records 
the language of the intellectual content 
of a resource and is required (if avail- 
able) and repeatable. Some resources 
may contain multiple languages while 
others, such as images, may not con- 
tain a language component at all. The 
recommended scheme for Language 
is ISO 639-2, a three-letter code set
for the representation of names of 
languages. 94 

Work Type refers to the manifestation
of the original element and is
required (if available) and repeatable. 
The application profile suggests apply- 
ing terms from an established scheme 
such as the Art and Architecture 
Thesaurus or the Thesaurus for 
Graphic Materials II: Genre and 
Physical Characteristics to ensure con- 
sistent usage. 95 Asset Source records 
the immediate parent or manifestation 
of the digital object and often will be 
the same as Work Type. This element 
is optional and repeatable. 

Repository Name lists the organization
or institution that holds the
original physical object, if applicable.
Repository ID holds a number or other
identifier for the resource from which 
the present resource was derived, such 
as a local accession number. Both of 
these elements are optional, because 
some digital resources do not have a 
repository, and both are repeatable. 
The Collection Name element records 
the formal or informal group of objects 
to which the item belongs. This ele- 
ment is optional and repeatable. 



The Digital Publisher is defined 
as the entity responsible for making 
the resource available to OhioLINK. 
Examples include an academic depart- 
ment, corporate body, publishing 
house, or museum. This element is 
mandatory and repeatable. If Digital 
Publisher is the same as Creator or 
Contributor, the application profile 
instructs users to enter the information 
in both elements. This element may or 
may not be related to the entity listed 
in the OhioLINK Institution element, 
which is a consistent reference to the 
OhioLINK member that contributes 
the material. OhioLINK Institution 
is required (if available) and repeat- 
able. Like Creator and Contributor, 
the recommended scheme for Digital 
Publisher is the Library of Congress 
Authorities File. 

The Digitizing Equipment ele- 
ment records the equipment or tools 
used to create the digital object. This 
element is optional and repeatable. 
The Rights element records information
about rights held in and over
the resource. This optional, repeatable
field typically contains a rights management
statement for the resource
or a reference to a service providing
the information. Rights information
often encompasses Intellectual
Property Rights (IPR), copyright, and
property rights. The application profile
states that if the Rights element is
absent, no assumptions may be made 
about any rights held in or over the 
resource. The Permissions element
identifies the audience to which the publisher
agrees to grant access to the content.
This mandatory, non-repeatable ele- 
ment has three options — world, state 
of Ohio, or OhioLINK. 

Asset Type records the manifes- 
tation of the resource. The software 
automatically captures this manda- 
tory, non-repeatable element. Values 
include image, audio, video, or text, 
and related properties such as file 
format, file size, and dimensions. This 
element maps loosely to both DC.Type
and DC.Format. Object Identifier
(OID) is a mandatory unique
identifier automatically assigned to the
digital object that is subsequently used
to form a persistent URL.

Creating the OhioLINK 
Application Profile 

Each element in the application pro- 
file contains eight different specifica- 
tions. Four of the specifications are 
presented in the condensed view of 
the DMC Core elements in table 2. 
"Element Name" represents a sin- 
gle characteristic or property of a 
resource. The "Definition" specifies 
the type of information required for 
the named element. In most cases 
definitions are taken directly from the 
Dublin Core Element Set. A definition 
may also contain comments providing 
additional information or clarification. 
"Obligation" indicates whether or not 
a value must be entered. Three types 
of obligations are used in this applica- 
tion profile. "Mandatory" is defined 
as a value that must be entered even 
if it requires the creation of an arbitrary
value. "Required (if available)" is
defined as a value that must be included
if it is available. "Optional" means
that it is not necessary to include a 
value for this element. "Occurrence" 
indicates whether a single value or 
multiple values can be included. Two 
occurrences are used in the DMC
Core — repeatable and non-repeatable. 
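
The obligation and occurrence specifications lend themselves to a mechanical check. The Python sketch below encodes both for a handful of elements, using the values given in Table 2, and reports records that omit a mandatory element or repeat a non-repeatable one. The rule table layout and the function name are assumptions made for illustration and are not taken from the application profile or from OhioLINK software. The system-supplied elements (Asset Type and OID) are left out because contributors do not enter them.

```python
# Obligation and occurrence for some DMC Core elements, following Table 2.
# Each entry is (obligation, repeatable). Other elements would be added
# the same way.
CORE_RULES = {
    "Title": ("mandatory", False),
    "Creator": ("mandatory", True),
    "Digital Publisher": ("mandatory", True),
    "Permissions": ("mandatory", False),
    "Date": ("required_if_available", True),
    "Digital Creation Date": ("required_if_available", False),
    "Contributor": ("optional", True),
}


def validate(record):
    """Return a list of problems found in a record.

    record maps element names to lists of string values. Only mandatory
    elements can be flagged as missing: "required (if available)" cannot
    be checked mechanically, because availability is a judgment made by
    the contributor.
    """
    problems = []
    for element, (obligation, repeatable) in CORE_RULES.items():
        values = [v for v in record.get(element, []) if v.strip()]
        if obligation == "mandatory" and not values:
            problems.append(f"{element}: mandatory element is missing")
        if not repeatable and len(values) > 1:
            problems.append(
                f"{element}: non-repeatable element has {len(values)} values"
            )
    return problems


sample = {
    "Title": ["Great Ballcourt", "Ballcourt, Chichen Itza"],
    "Digital Publisher": ["Example University"],
    "Permissions": ["world"],
}
print(validate(sample))
```

In this sample the second title would be flagged, which mirrors the Task Force's decision to place alternate titles in an additional field instead of repeating Title.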

"Recommended Schemes" refers
to established lists of terms or classification
codes from which a user
can select when assigning values to
an element. Two types of schemes
may be used: vocabulary encoding schemes, which
are controlled lists of words such as
LCSH, and syntax encoding schemes,
which indicate that the value must
be formatted in accordance with a
formal notation, such as how a date is
to be entered. "Input
Guidelines" list common conven- 
tions and syntax rules used to guide 
the data-entry process. In the case 
of system-supplied elements, a brief 
Table 2. DMC core elements (condensed view)

Element name             Obligation                   Occurrence of values  Mapping
-----------------------  ---------------------------  --------------------  ---------------------------------
Title                    Mandatory                    Non-repeatable        DC.title
Creator                  Mandatory                    Repeatable            DC.creator
Contributor              Optional                     Repeatable            DC.contributor
Date                     Required (if available)      Repeatable            DC.date
Description              Required (if available)      Repeatable            DC.description
Subject                  Required (if available)      Repeatable            DC.subject
Spatial Coverage         Optional                     Repeatable            DC.coverage.spatial
Temporal Coverage        Optional                     Repeatable            DC.coverage.temporal
Language                 Required (if available)      Repeatable            DC.language
Work Type                Required (if available)      Repeatable            DC.type
Repository Name          Optional                     Repeatable            n/a
Repository ID            Optional                     Repeatable            DC.source
Digital Publisher        Mandatory                    Repeatable            DC.publisher
Digital Creation Date    Required (if available)      Non-repeatable        DC.date.available
Digitizing Equipment     Optional                     Repeatable            n/a
Asset Source             Optional                     Repeatable            DC.relation.HasFormat
Rights                   Optional                     Repeatable            DC.rights
Collection Name          Optional                     Repeatable            DC.relation; DC.relation.IsPartOf
OhioLINK Institution     Required (if available)      Repeatable            n/a
Asset Type               Mandatory (system supplied)  Non-repeatable        DC.format; DC.type
OID (Object Identifier)  Mandatory (system supplied)  Non-repeatable        DC.identifier
Permissions              Mandatory                    Non-repeatable        n/a

Source: OhioLINK DMSC Metadata Task Force, "OhioLINK Digital Media Center (DMC) Metadata
Application Profile" (May 11, 2004), http://dmc.ohiolink.edu/docs/DMC_AP.pdf (accessed Aug. 11,
2006).




explanation of the process is provided. 
Two types of input guidelines are pro- 
vided — general and element-specific. 
General guidelines that apply to more 
than one element are located near the 
beginning of the application profile to 
cut down on repetition and length of 
the document. Input guidelines spe- 
cific to an element are located on the 



page for that element. "Examples" are 
provided for each element to illustrate 
the types of values, conventions, and 
syntax used for the element. "Maps to 
DC Element" gives the DC element 
equivalent, if applicable. 
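
The "Maps to DC Element" specification amounts to a crosswalk. As a sketch of how such a crosswalk might be applied when preparing records for harvesting, the Python below collapses a DMC record into simple (unqualified) Dublin Core using the mappings listed in Table 2. The dictionary layout and function name are hypothetical. Asset Type is omitted because the profile maps it only loosely to two DC elements, and elements marked n/a in Table 2 have no entry.

```python
# DMC Core element -> simple Dublin Core element, following the Mapping
# column of Table 2. Refinements such as DC.coverage.spatial are collapsed
# to their parent element because simple DC has no qualifiers.
DMC_TO_DC = {
    "Title": "title",
    "Creator": "creator",
    "Contributor": "contributor",
    "Date": "date",
    "Description": "description",
    "Subject": "subject",
    "Spatial Coverage": "coverage",
    "Temporal Coverage": "coverage",
    "Language": "language",
    "Work Type": "type",
    "Repository ID": "source",
    "Digital Publisher": "publisher",
    "Digital Creation Date": "date",
    "Asset Source": "relation",
    "Rights": "rights",
    "Collection Name": "relation",
    "OID (Object Identifier)": "identifier",
}


def to_simple_dc(record):
    """Collapse a DMC record (element -> list of values) into simple DC."""
    dc = {}
    for element, values in record.items():
        target = DMC_TO_DC.get(element)
        if target is None:
            continue  # no DC equivalent, for example Permissions
        dc.setdefault(target, []).extend(values)
    return dc


record = {
    "Title": ["Exhibition flight over Lake Erie"],
    "Date": ["1911"],
    "Digital Creation Date": ["2004-05-11"],
    "Permissions": ["world"],
}
print(to_simple_dc(record))
```

Note that collapsing to simple DC is lossy: the original date and the digital creation date both end up in the same date element, which is the trade-off accepted in exchange for cross-domain interoperability.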

Input guidelines are included to 
provide a relatively simple way to pro- 
mote data consistency and assist with 



data creation while still allowing some 
flexibility. The application profile was 
created to accommodate an audience 
beyond catalogers and others famil- 
iar with metadata creation. The Task 
Force attempted to anticipate ques- 
tions and to help those unfamiliar with 
the metadata process plan their proj- 
ects by providing decision points up 
front. While anticipating all situations 
was impossible, every effort was made 
to assist contributors in metadata cre- 
ation. External content standards are 
also referenced as appropriate. 

Current Status of the Digital 
Media Center and the 
Application Profile 

New collections are no longer being 
added to the DMC and the collec- 
tions contained in the DMC are being 
migrated to a new platform called the 
Digital Resource Commons (DRC), 
funded by a 2003 Technology Initiatives 
grant from the Ohio Board of Regents. 
The OhioLINK DRC is part of the 
Ohio Commons for Digital Education, 
a collaborative effort by OhioLINK, 
the Ohio Learning Network, and the 
Ohio Supercomputer Center/OARnet 
to develop digital education resourc- 
es, services, and capabilities in Ohio. 
As part of the DRC, OhioLINK is 
building a general-purpose digital 
object repository that will accept and 
share a wider variety of collections 
and digital objects than the DMC 
can accommodate. The DRC will be 
a collection of research and course- 
ware digital repositories connecting 
to a wide array of existing systems, 
including Collaborative Learning 
Environments, portals, and integrated 
library systems. 

All OhioLINK member institu- 
tions are entitled to contribute con- 
tent to the DRC, eliminating "the 
need for redundant and costly local 
investments by enabling Ohio colleges 
and universities to utilize OhioLINK's 
hardware, software, and staff to create 
their own repositories." 96 Individual
repositories are customizable, allowing 
institutions to define how content is 
contributed and presented. The con- 
tributing institutions maintain owner- 
ship of the work and control access, 
allowing rapid dissemination to world- 
wide audiences or to a single person. 
The DRC will enhance the quality of 
education by providing a shared point 
of access to Ohio's scholarly knowl- 
edge. Students will have "a versatile 
resource for sharing and showcasing 
. . . research projects as well as access- 
ing course materials, research and 
learning objects to support their
learning." 97 Further collaborations between
OhioLINK, Ohio's K-12 community, 
and other Ohio institutions will enable 
the DRC to be a foundation of the 
Ohio education system in the twenty- 
first century. 

The transfer of DMC collections 
to the new DRC platform is sched- 
uled to be completed by March 2007. 
The application profile developed by 
the DMSC Metadata Task Force and 
described in this paper will continue to 
be a foundational document for proj- 
ect development in the new system. 
Contributing members are encour- 
aged to use the application profile 
during the planning and implementing 
stages of new projects. The current 
application profile will be updated
to reflect the DRC environment.



Final Recommendations 

Eight recommendations were pre- 
sented to the OhioLINK Database 
Management and Standards Commit- 
tee centering around three broad 
categories: the need for continued 
leadership, a call for high-quality 
metadata development, and the neces- 
sity of knowledge sharing. The first 
recommendation addressed the need 
for leadership, oversight, coordination, 
and continuity for DMC metadata. 
The Task Force recommended that 
the DMSC develop and document 



an overarching metadata strategy to 
provide a framework for all the meta- 
data related initiatives at OhioLINK. 
Furthermore, the Task Force recom- 
mended that OhioLINK form a body 
to coordinate metadata-related proj- 
ects and initiatives, to guide software 
and tool development, to facilitate 
metadata harvesting and federated 
searching, and to keep OhioLINK 
metadata documentation up-to-date. 

The Task Force recognized that 
the identification of a core set of 
metadata elements is only a first step. 
The need for high-quality metada- 
ta development will increase in the 
future. Therefore, the Task Force rec- 
ommended that OhioLINK develop 
extended element sets with supporting 
documentation for various subject and 
format areas. The Task Force also rec- 
ommended that OhioLINK develop 
policies to address legacy data issues 
to ensure continued usability of older 
collections. 

A group of recommendations 
addressed issues of training, market- 
ing, and knowledge sharing. The Task 
Force recommended that OhioLINK 
host a workshop or conference on 
metadata and digital collection prac- 
tices where participants would begin 
to form a viable OhioLINK metadata 
practice community. Concurrently, the 
Task Force recommended the creation 
of an electronic discussion list for shar- 
ing information among this emergent 
community and current DMC/DRC 
contributors. 

The Task Force proposed the cre- 
ation of an online, locally developed, 
wizard-type tool to assist digital col- 
lection managers with project plan- 
ning. After some mildly heated, mostly 
humorous debate about what to call 
this tool, the name "MetaBuddy" was 
chosen. In concept, MetaBuddy is an 
interactive version of the OhioLINK 
application profile that could help 
potential contributors determine the 
metadata needs of the collection in 
question. MetaBuddy would lead the 
project manager through the applica- 



tion profile, facilitating the preliminary 
mapping of existing data structures 
to the core metadata elements. The 
collection-specific application profile 
created in MetaBuddy would then 
assist OhioLINK programmers with 
the data mapping of the local col- 
lection into the DMC or DRC. The 
online tool would promote the use 
of the application profile through its 
ease of use and adaptability to local 
needs, promote the use of the DMC 
or DRC to mount digital collections, 
and ensure that the standards in the 
application profile provide consistent, 
reliable access to OhioLINK's digital 
collections. The MetaBuddy online 
tool is currently in development. 
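
MetaBuddy's design is not described beyond the concept above. Purely as an illustration of what the preliminary mapping step could involve, the Python sketch below takes a proposed mapping from a collection's local field names to DMC Core elements and reports which contributor-supplied mandatory elements are still unmapped and which local fields would become additional (non-core) fields. Every name in it, including the local field names, is hypothetical.

```python
# Mandatory elements that a contributor must supply. Asset Type and OID
# are mandatory too, but the system supplies them.
MANDATORY_CONTRIBUTOR_SUPPLIED = {
    "Title",
    "Creator",
    "Digital Publisher",
    "Permissions",
}


def review_mapping(local_to_core):
    """Summarize a proposed mapping from local fields to core elements.

    local_to_core maps each local field name to a DMC Core element name,
    or to None when the field has no core equivalent and would be carried
    as an additional field.
    """
    mapped = {core for core in local_to_core.values() if core is not None}
    return {
        "missing_mandatory": sorted(MANDATORY_CONTRIBUTOR_SUPPLIED - mapped),
        "additional_fields": sorted(
            local for local, core in local_to_core.items() if core is None
        ),
    }


proposal = {
    "caption": "Title",
    "photographer": "Creator",
    "negative_no": "Repository ID",
    "film_stock": None,
}
print(review_mapping(proposal))
```

A report of this kind gives a project manager the decision points up front: here the collection still needs a Digital Publisher and a Permissions value before it could be loaded.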

The final recommendation 
addressed the need to expand knowl- 
edge of the DMC and DRC through- 
out the OhioLINK community. The 
Task Force saw a need to develop and 
implement a formal marketing strategy 
to recruit contributors and content and 
increase end-user awareness and use. 
The OhioLINK Database Management 
and Standards Committee is repre- 
sented on the steering committee of 
the DRC and the development of the 
repository is being closely monitored. 
DMSC members are currently dis- 
cussing opportunities to increase the 
awareness and use of the DRC. 



Lessons Learned 

The Task Force's work was accomplished
over twenty months. During
that time, a group of people from 
different institutions and backgrounds 
collaborated to build a foundation for 
OhioLINK digital collections metada- 
ta. Many lessons were learned. Here 
are a few of the most significant: 

• Standards are still important. 
Like anything that requires a 
certain level of compatibility 
between systems, metadata is 
standards-driven. Standards 
provide the foundation for 
interoperability. Anyone who
wants to increase access to their 
digital collections — whether 
through a collaborative proj- 
ect, metadata harvesting, or 
Google — needs to be aware of 
a variety of metadata-related 
standards. 

• Standards do not eliminate the 
need for local decisions. An 
application profile can help nar- 
row the choices by making rec- 
ommendations and providing 
guidelines. However, local deci- 
sions will still need to be made 
for each project. 

• It is not necessary to reinvent 
the wheel with every project.
Even though local decisions 
need to be made for each proj- 
ect, most projects will have com- 
mon aspects. Find an example 
of a locally defined application 
profile or data dictionary for a 
similar project and adapt it. 

• The best and worst thing about 
metadata is that it does not 
come with content standards. 
Traditional MARC is a pack- 
age deal, complete with a set 
of standards that are designed 
to work for everyone. Few 
people would think of using 
MARC with a standard other 
than AACR2. The same cannot
be said about nontraditional
metadata. One can pick and 
choose from a variety of content 
standards or even create a local 
variation. This freedom is good 
when trying to meet locally 
defined needs; it is bad when 
aiming for interoperability. 

• The metadata universe is 
large and subject to change. 
This might be stating the obvi- 
ous, but keeping this in mind 
when planning a new project 
is important. Standards are 
supposed to provide a certain 
amount of stability and users 
may be tempted to become 
complacent. However, one 



should remember that metada- 
ta is standards-based, and new 
standards and technologies are 
rapidly appearing on the scene
that will need to be reconciled 
with the current standards and 
technologies. No matter what 
standards are adopted, being 
aware of new developments is 
important. If collections are to 
be accessible now and in the 
future, metadata cannot be cre- 
ated in a vacuum. 

• Metadata can be as simple or as 
complex as wanted or needed. 
Ideally, the need for interoper- 
ability, which requires a core of 
universal elements, is balanced 
with the needs of a specific col- 
lection, project, or community. 
One way this can be accom- 
plished is through the use of 
application profiles and extend- 
ed element sets. However, 
research shows that few small 
or independent projects with 
limited resources have applica- 
tion profiles. Remember that 
any attempts to standardize the 
metadata will help with informa- 
tion retrieval and limited access 
is better than no access. 

• Having a cataloging background 
is useful. The group deci- 
sion-making process is com- 
plex. Catalogers bring certain 
assumptions to the table about 
the importance of standards 
and guidelines that can jump-start
the metadata process, even
if they have little knowledge of 
non-traditional metadata. 

• Identifying a set of core ele- 
ments is an important first step, 
but it is only the first step. The 
work accomplished thus far will 
serve as a foundation for related 
initiatives within the OhioLINK 
community. Refinement
and expansion of this
work must continue to meet
the changing needs of the consortial
community.



Conclusion 

After five years of expansion, the 
OhioLINK DMC metadata needed 
some standardization to facilitate 
access to the collections and future 
growth. Although procedures to sub- 
mit new collections were in place, no 
metadata standards or guidelines were 
available to assist contributors. One of 
the tenets of the DMC was to eliminate 
barriers to institutional participation. 
The legacy of this principle demon- 
strates one challenge facing consortial 
repositories. A series of subject-spe- 
cific databases based on various meta- 
data standards had been created for 
different audiences. This variety of 
resources ultimately hindered access 
to more than one collection at a time. 
A Task Force was appointed by the 
OhioLINK Database Management 
and Standards Committee to investi- 
gate metadata schema and best prac- 
tices documentation. While the Task 
Force was unable to discover stan- 
dards and best practices that could be 
adopted wholesale by OhioLINK for 
the DMC, the examination of vari- 
ous best practices documentation and 
standards helped define a core of 
cross-disciplinary metadata elements. 
The development of the OhioLINK 
DMC Metadata Application Profile 
and subsequent recommendations by 
the Task Force helped lay a foun- 
dation for the creation of quality, 
consistent, and compatible metadata 
for future collections contributed to 
OhioLINK's online repositories. This 
application profile will help define 
projects, schemas, and standards for 
the new OhioLINK DRC to facili- 
tate access for users and training for 
contributors. 
Notes on Operations

FRBR Principles Applied 
to a Local Online Journal 
Finding Aid 

By Chew Chiat Naun 

This paper presents a case study in the development of an online journal finding 
aid at the University of Illinois at Urbana-Champaign (UIUC), with particular 
emphasis on cataloging issues. Although not consciously designed according to 
Functional Requirements for Bibliographic Records (FRBR) principles, the 
Online Research Resources (ORR) system has proved amenable to FRBR analysis.
The FRBR model was helpful in examining the user tasks to be served by the
system, the appropriate data structure for the system, and the feasibility of map- 
ping the required data from existing sources. The application of the FRBR model 
to serial publications, however, raises important questions for the model itself, 
particularly concerning the treatment of work-to-work relationships. 
The University of Illinois at
Urbana-Champaign's (UIUC)
Online Research Resources (ORR) 
registry (www.library.uiuc.edu/orr) is 
a database-driven, alphabetical list of 
online resources similar in principle to 
comparable lists provided by vendors 
such as Serials Solutions and TDNet. 
ORR is, in effect, an alternative or 
supplementary catalog for specialized 
access to online resources, especially 
electronic journals. Like other tools 
of its kind, it was designed partly to 
overcome some of the drawbacks of 
online catalogs in dealing with this 
class of material. Antelman says that 
such tools "are potential sources of 
innovation because they are amenable 
to experimentation in ways that our 
current integrated library systems are 
not." 1 UIUC's experience with ORR is 
a case study of a home-grown system 
built to local specifications. 

ORR was not the first system of its 
kind developed by the UIUC library. 
An earlier electronic resources registry 
had been in existence for some years, 
but the acquisition of a data feed from
TDNet in 2003 provided the impetus 
to redevelop the service. The TDNet 
service monitors a range of providers
and notifies the library of any
change either in content or location 
(URL, or uniform resource locator). 
Although TDNet normally supplies 
a public interface, the library chose 
to develop its own. The development 
work was undertaken by the library's
systems office with the guidance of
a committee composed of staff from
systems, public services, and technical 
services. The new version was built on 
a redesigned data structure capable 
of incorporating additional data from 
external sources. While the redevelop- 
ment was primarily intended to facili- 
tate maintenance of the data by library 
staff, it also made possible significant 
improvements in the public interface. 

ORR is not intended to be a com- 
prehensive catalog of the library's elec- 
tronic holdings. Its scope is limited 
to online article databases, journals, 
and reference works. The majority 
of electronic books were excluded on 
the principle that book-like objects 
were more appropriately represented 
in the library's online catalog. Each of 
the categories of resources covered by 
ORR presented its own metadata and 
interface design challenges. However, 
the most urgent (and in some ways
the most complex) task for the developers
of ORR was to facilitate access
to online journals. Jones describes this 
class of publications as "the subset [of 
continuing resources] characterized 
by issues containing contributions by 
individual authors, the subset that is 
most often analyzed in abstracting and 
indexing services." 2 They will be the 
main focus of this paper. 

At the time of writing, ORR listed 
42,640 online journals, 1,344 refer- 
ence works, and 439 article databases. 
These totals are for unique titles; the 
number of unique URLs is much 
higher. In each of the two years ORR 
has been operational, it has logged 
between four and five million hits, 
counting only links through to full-text 
content. ORR is a key resource for the 
library's patrons. 

The technical and logistical aspects 
of ORR's development have been 
described by German, Shelburne, and 
Norman, and the reader is invited to 
consult their publications for addi- 
tional information about this project. 3 
The literature on the management of 
online journals is extensive, including 
the provision of access through online 
journal finding lists similar to ORR. 
Although several articles provide illu- 
minating details about the database 
structures employed in these systems, 
relatively little appears to have been 
published dealing with the biblio- 
graphic relationships in particular. 4 

This paper examines the data 
structures employed in ORR with 
respect to bibliographic relation- 
ships among serial works, versions, 
and aggregates. These relationships 
are described in the International
Federation of Library Associations and
Institutions' (IFLA) Functional Requirements
for Bibliographic Records (FRBR). 5
Although ORR was not designed with 
the FRBR model specifically in mind, 
its development was informed by 
many of the same considerations that 
underlie that model. The FRBR model 
provides a context to understand specific
decisions made in creating ORR,
including the compromises involved
and areas where improvements may 
be sought in future. 

This paper does not attempt to 
cover all aspects of ORR's design. 
For example, since its launch, ORR 
has been augmented with a rights 
management module and now more 
closely resembles a comprehensive 
electronic resources management sys- 
tem. These newer developments, and 
their relationship to the cataloging 
data in ORR, are beyond the scope of 
the present paper. 

User Tasks 

The FRBR report ascribes to the end- 
user the following tasks: to find docu- 
ments matching a given set of criteria, 
to identify those that are relevant, to 
select the desired or available ver- 
sions, and to obtain them. 6 FRBR also 
recognizes the need in some contexts 
to navigate between resources. 7 This
breakdown is useful for understanding
the purposes served by various
elements of ORR's design. Before 
attempting an analysis, one must ask 
the question: exactly what is the user 
supposed to be trying to find, identify, 
select, and obtain? 

ORR is primarily concerned with 
bibliographic control and access at the 
level of the serial publication, not at 
the level of the individual article. This 
emphasis reflects that of the traditional 
library catalog, where, as Tillett observes,
"We cannot afford to always describe
and identify every work although that
may be the 'ideal' (sometimes leaving
such levels to abstracting and indexing
services, sometimes to bibliographies,
finding aids, and reference tools)." 8

This point is well understood 
by librarians, but not self-evident to 
patrons. Most of the time, what a 
patron is interested in is the specific 
content of a journal article, and an inex- 
perienced patron naturally approaches 
a tool like ORR with the expectation 
of finding individual articles directly.
As Antelman puts it, "library users'
sense of a serial work diverges sig- 
nificantly from the way it is currently 
implemented in library systems." 9 The 
identity of a serial work is not always 
a matter of indifference to the end 
user. To look no further than their 
utilitarian role, scholarly journals are 
an institutionalized part of the system 
for scholarly dissemination, review, 
and recognition. ORR includes at least 
two data elements at the serial-work 
level that reflect this role: the Institute 
for Scientific Information impact fac- 
tor for each title and its peer review 
status. Nonetheless, the serial title is 
the primary unit of representation in 
ORR because it helps users obtain 
relevant documents. The serial pub- 
lication is the vehicle of distribution 
for article content. Data relating to 
the manifestations and copies (or, in 
FRBR terminology, items) of the serial 
publication, including the URLs for 
available sources, coverage dates, and 
(for print holdings) location and call 
numbers, enable patrons to obtain 
copies of the articles they seek. 

The role that ORR plays in sup- 
porting user tasks may be better 
understood in the context of concur- 
rent plans at the UIUC library to 
introduce a broadcast search facility 
and link resolver. Broadcast searching 
will facilitate finding and identification 
tasks by enabling users to search for 
articles and citations in multiple data- 
bases simultaneously. The link resolver 
will act, where required, as a bridge 
between the results found in these 
databases, whether searched simulta- 
neously or separately, and the selection 
and obtaining tasks jointly supported 
by the link-resolution knowledge base, 
serials management system, library 
catalog, and document delivery ser- 
vice. Part of the original plan for 
ORR was to serve as a knowledge 
base for reference linking. This plan 
was later modified when the library 
decided to acquire a commercial link 
resolver with its own knowledge base. 
Although not completely integrated, 
these systems will provide mutually
complementary access to the library's 
online collections. 

ORR serves the core function 
of allowing users to find and identify 
known serial publications by title, view 
details of all the available sources 
(including coverage dates and access 
restrictions), and make an appropriate 
selection from among them. The dif- 
ficulty is that no existing database con- 
tains all the requisite data. Even where 
the data are available, the necessary 
linkages between them do not always 
exist. For example, if a complete set 
of back files is not available for a given 
journal, the data necessary to locate 
a library's print holdings may not be 
obvious from a vendor's database. In 
order to support the desired user tasks, 
a system has to pull together disparate 
data from a range of otherwise unre- 
lated sources and assemble them into 
a structure that will make displaying 
and navigating the relevant relation- 
ships possible. The data must be easily 
maintained so that keeping ORR com- 
plete and up to date is practical. 

ORR is designed around a strat- 
egy that satisfies both of these require- 
ments. The strategy is, where possible, 
to import data from existing sources 
and use them to populate ORR records 
according to a quality hierarchy at the 
level of each individual field. The 
quality hierarchy ranks the preferred 
sources for each data element, an 
approach that allows the database to 
be populated to the fullest extent pos- 
sible, while ensuring that data in each 
field are drawn from the most authori- 
tative, complete, or current source. 
Thus, while titles are available from 
both TDNet and the Voyager integrat- 
ed library system (ILS) data feeds, the 
ILS source is preferred for title infor- 
mation. Conversely, the TDNet data 
receive priority for URLs. Automated 
processes alone cannot ensure the 
completeness and integrity of all the 
data; maintaining ORR still requires 
manual data entry and cleanup. 
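
The field-level quality hierarchy described above can be illustrated with a minimal sketch. It is not ORR's actual code: the field names, source labels, and sample values are assumptions made for illustration, and only the two stated preferences (ILS for titles, TDNet for URLs) come from the text.

```python
from typing import Optional

# Hypothetical per-field ranking of sources, most preferred first.
# Only the title and URL preferences are taken from the text.
FIELD_PRIORITY = {
    "title": ["ils", "tdnet"],
    "url": ["tdnet", "ils"],
}

def merge_record(candidates: dict[str, dict[str, Optional[str]]]) -> dict[str, Optional[str]]:
    """Fill each field from the highest-ranked source that supplies a value."""
    merged: dict[str, Optional[str]] = {}
    for field_name, ranking in FIELD_PRIORITY.items():
        merged[field_name] = None
        for source in ranking:
            value = candidates.get(source, {}).get(field_name)
            if value:  # skip missing or empty values and try the next source
                merged[field_name] = value
                break
    return merged

# Illustrative feeds for one journal (invented values).
feeds = {
    "tdnet": {"title": "J. Example Stud.", "url": "https://example.org/jes"},
    "ils": {"title": "Journal of Example Studies", "url": None},
}
record = merge_record(feeds)
```

Because the ranking is kept per field rather than per source, a record can take its title from one feed and its URL from another, which is the point of the strategy. Fields that no source supplies stay empty and are left for manual entry.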



Functionality and Data 

The biggest drawback of the UIUC 
online catalog as a discovery tool for 
the library's online collections is its 
size. At the time of writing, it contains
just under five million bibliographic
records, compared to 45,000
titles listed in ORR. Although searches 
may be scoped in various ways, the 
proportion of unwanted hits inevitably 
remains high. 

The online catalog's functional- 
ity has significant limitations, as well. 
Its proprietary design makes updating
using external non-MARC data
sources difficult, particularly with the 
degree of granularity needed. Certain 
entities conceptually important to the 
management of electronic resources, 
such as content providers, are difficult 
to represent adequately within the 
confines of the MARC format. The 
online catalog's ability to manipulate a 
variety of data into a desired Hypertext 
Markup Language (HTML) display 
format is strictly limited. It can collocate
alternative versions of a title only
to the limited degree that the
Anglo-American Cataloguing Rules, 2nd ed.
(AACR2) record structures and the
system's own relatively inflexible filing
and display algorithms permit it to
do so. 10 Although the MARC format
has provisions for linking between 
alternative versions of a work and 
between successive titles in a journal's 
history, these linking mechanisms are 
only imperfectly implemented in the 
catalog's public interface. 

The advantages of ORR as an 
alternative to the online catalog may 
be seen from a brief outline of the
functionality of the system's public 
interface and some of its specific 
data elements and design features. 
Users consulting ORR may search 
all resources together, or scope their 
searches by resource type, the latter 
being recorded in a field in the ORR 
resource record (figure 1). Titles may 
be searched for an exact match with
implied right truncation or by keyword,
with a further option for implied
truncation of each word within the 
title. This latter option is particularly 
useful for finding abbreviated titles. 
Searches match on variant titles as 
well as titles proper, thanks to catalog- 
ing data pulled in from the 24X title 
fields of MARC records. 
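
The two title-search modes can be sketched as follows. This is an illustrative reading of the description above, not ORR's implementation; the function names, the normalization rule, and the sample variant title are assumptions.

```python
import re

def words(text: str) -> list[str]:
    """Lowercase a title or query and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def begins_with(query: str, title: str) -> bool:
    """Exact match with implied right truncation: the title starts with the query."""
    return " ".join(words(title)).startswith(" ".join(words(query)))

def each_word_truncated(query: str, title: str) -> bool:
    """Keyword match in which every query word is treated as a truncated stem."""
    title_words = words(title)
    return all(any(t.startswith(q) for t in title_words) for q in words(query))

def search(query: str, titles: list[str], matcher) -> bool:
    """A record matches if its title proper or any variant title matches."""
    return any(matcher(query, t) for t in titles)

# Title proper plus one variant title (the variant is an invented example).
record_titles = ["Journal of the American Chemical Society", "JACS"]
```

Under these assumptions, the abbreviated query "j am chem soc" fails the begins-with search but succeeds when each word is truncated, which is why the second option suits abbreviated titles.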

The interface also allows the user 
to navigate between earlier and later 
titles in the serial work's history, pro- 
viding linked title displays drawn again 
from MARC data, in this case from the 
780 (previous title) and 785 (succeed- 
ing title) linking entry fields. Certain 
other work-level data elements are 
drawn from a variety of potential 
sources. International Standard Serial 
Number (ISSN) data, for example, 
are compiled opportunistically from 
TDNet, EBSCO, the ILS, and Ulrich's
Periodicals Directory. Ulrich's is also 
the usual source for the ISI impact 
factor and the peer-review status. 

ORR's approach to subject access 
reflects the same priority given to 
access at the serial-work level. The 
decision was made very early not to 
offer generic keyword searching of the 
database, in spite of the prevalent prac- 
tice of supplying a keyword option in 
almost any context. It was decided that 
keyword searching was suited main- 
ly to the fine-grained subject access 
and article-level retrieval offered by 
article and citation databases. The 
broadcast search interface would be 
the appropriate place to encourage 
generic keyword searching. In con- 
trast, the ORR interface was designed 
to allow very broad subject browsing 
using an in-house subject descriptor 
list. To assist in the assignment of 
these subject descriptors, the ORR 
database performs mappings from 
Library of Congress Subject Headings 
(LCSH) so that the UIUC descrip- 
tors are derived automatically on the 
basis of data in MARC records. UIUC 
reference librarians may change or 
add descriptors. They also may add
natural-language descriptions viewable
by the public for some ORR
resources (figure 2).

Figure 1. ORR's main search page

The database schema for ORR
may be seen in figure 3. The data
elements previously mentioned reside
in the resource record, which is one
of ORR's three main building blocks
(the other two are the instance record
and the interface record). The latter two are
invoked to display relevant information
about the online sources or provid- 
ers available for each title (including 
the dates covered by each source), 
any access conditions that may apply, 
and other related information such as 
current availability. ORR's public dis- 
play groups information about sources 
directly under the entry for the relevant 
title. ORR thus offers a hierarchical 
display that is supported by a hierar- 
chical record structure (figure 4). The 
user may toggle to a detailed display 
(figure 2). These displays are similar to 
the grouped catalog displays advocated 
by commentators such as Yee. 11 

The system has three building 
blocks but only two levels in the display.
Most of the pertinent data at the
level of the particular source or provider,
such as the URL, are stored in
the instance record. However, the exis- 
tence of the interface record reflects 
the fact that online journals are typi- 
cally acquired as part of a package that 
is licensed or purchased together and
hosted on a common platform. The 
interface record often represents a 
provider, such as Wiley InterScience, 
but may sometimes represent instead 
a collection packaged by the provider, 
such as Wiley InterScience's chemis- 
try back files. While a patron viewing 
ORR at the serial-title level seldom 
cares where the content comes from, 
this information is essential to a range 
of management tasks, including collec- 
tion development and maintenance. 
The interface record makes it possible
to provide an alternative view of the database,
thus supporting these tasks and
enabling librarians to view and deal 
with, for example, all the SilverPlatter 
databases or each of the various JSTOR 
collections together. In addition to data 
elements identifying the provider and, 
where applicable, the collection, the
interface record also includes other
information relevant to those entities,
such as an identifier referencing the 
provider in the TDNet data feed. Data 
specific to a given title offered by the 
provider, such as URLs and coverage 
dates, are stored in the instance record. 
A few data elements, such as status 
(i.e., availability), are found in both the 
instance and the interface record. In
these cases, the interface record supplies
a default value that may be overridden
or augmented for a particular
title. The provider- or collection-level 
view of the database has a counterpart
in the public interface where users can 
obtain similar listings by choosing the 
provider or collection from a drop- 
down list on the search page for all 
resource types. The user interface also 
displays and links to any print holdings 
that are available for each title in the 
local online catalog. This feature, and 
the structural issues it raises, will be 
discussed later in this paper.
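
The three-record structure described in this section can be sketched in miniature. The class and field names below are hypothetical and do not reproduce the schema in figure 3. The sketch also assumes that the provider-level (interface) status acts as the default and that a title-level (instance) value overrides it.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Interface:
    """Provider or packaged collection (hypothetical field names)."""
    provider: str
    collection: Optional[str] = None
    status: str = "available"

@dataclass
class Instance:
    """One title as offered through one interface."""
    interface: Interface
    url: str
    coverage: str
    status: Optional[str] = None  # None means: inherit the interface's status

    def effective_status(self) -> str:
        # Assumption: the title-level value, when present, overrides the default.
        return self.status if self.status is not None else self.interface.status

@dataclass
class Resource:
    """Work-level record: title, type, and the sources grouped under it."""
    title: str
    resource_type: str
    instances: list[Instance] = field(default_factory=list)

def titles_for_provider(resources: list[Resource], provider: str) -> list[str]:
    """Provider-level view: every title with at least one instance from the provider."""
    return [r.title for r in resources
            if any(i.interface.provider == provider for i in r.instances)]

wiley = Interface(provider="Wiley InterScience")
journal = Resource("Example Journal", "journal", [
    Instance(wiley, "https://example.org/current", "1996-"),
    Instance(wiley, "https://example.org/backfile", "1950-1995",
             status="temporarily unavailable"),
])
```

Grouping instances under a resource gives the two-level public display, while the shared `Interface` object supports the provider-level listing used for management tasks.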

Several commercial products
are comparable to ORR in purpose,
design, and functionality, and some
are highly innovative. Ex Libris'
SFX-based journal list, for example, realizes
ORR's original design objective of
driving a journal list and link resolver
from the same knowledge base. ORR,
however, offers a number of features
not generally found elsewhere. While
most commercial journal finding lists
now provide a display of available
sources grouped by title, and several
also offer a link to the online catalog,
most do not yet have the ability to link
between earlier and later titles, or to
search by provider or collection.

Figure 2. Record display in ORR's public interface

Figure 4. Title list in ORR's public interface

The FRBR Hierarchy 

Librarians are most familiar with FRBR 
in its role as a lens through which to 
scrutinize a cataloging code, such as 
AACR2. The task in designing ORR 
was not to codify a set of cataloging 
rules, but to establish a data structure 
together with a set of procedures for 
populating it. These two tasks have the 
same objective of facilitating access to 
resources by creating a coherent and 
lucid representation of them. 

The FRBR framework commends 
itself to the bibliographic management 
of online journals because of the overlapping
needs to relate content from
various providers to a common work;
to link related content across differ- 
ent platforms; to associate holdings 
in different formats; and to trace a 
publication's identity through successive
title changes, splits, or mergers. A
model articulating these relationships 
within a comprehensive framework 
holds promise for guiding decisions 
about the appropriate record structure 
and content, database schema, and 
display format for representing these 
resources. 

Most discussions of FRBR rela- 
tionships focus on the hierarchy of 
Group 1 entities: work, expression, 
manifestation, and item. This hierar- 
chy easily fits some aspects of ORR's 
design, even if not all of the entities 
and attributes implied by the latter are 
listed in the FRBR report's ontology. 
The resource record contains work- 
level data such as titles and subjects. 
The instance record corresponds to 
the manifestation, recording such attri- 
butes as the provider (which may be 
likened to a distributor for a print pub- 
lication), access address, and source 
for access authorization. 12 The presen- 
tation format of the text is determined 
at the level of the individual provider
each time it is viewed or printed,
allowing for variations introduced by 
style sheets or other branding or cus- 
tomization features. Accordingly, each 
online viewing or printing may be con- 
sidered an item (partially) instantiating 
the manifestation. 

Discerning expression-level attri- 
butes in ORR is difficult. ORR elides 
attributes that could be modeled as
distinguishing characteristics of expres- 
sions, such as whether accompany- 
ing graphics are provided. 13 In effect, 
ORR assimilates all electronic versions 
to the one expression. In this respect, 
its practice closely resembles the 
Cooperative Online Serial (CONSER) 
program's aggregator-neutral record, 
which similarly elides expression- and 
manifestation-level data to collocate all 
online versions under a single record. 14 
Several recent analyses in the literature 
suggest that the appropriate treatment 
of expressions may be dependent on 
the nature of the works represented. 15
The FRBR report's statement, that "on 
a practical level, the degree to which 
bibliographic distinctions are made 
between variant expressions of a work 
will depend to some extent on the 
nature of the work itself, and on the 
anticipated needs of users," supports
this view. 16 

Finally, how does ORR's inter- 
face record fit into the FRBR model? 
The interface record represents an aggregate
entity that lies on a plane
intersecting the main hierarchy. Such
aggregates are not peculiar to continu- 
ing resources. Structurally, they are 
like certain types of aggregates found 
in the monographic domain, such as
"bound together" titles and collect- 
ed editions, which share attributes at 
the item and the manifestation level 
respectively. Mimno, Crane, and Jones 
offer the following analysis of a col- 
lected edition: "At the work level, one 
play by Aeschylus is clearly a distinct 
entity from another play, but at the 
manifestation level, the publication 
information for every translated play in 
the volume is the same, and therefore 
should be kept in a single record." 17
They tentatively advocate linking a sin- 
gle manifestation-level record for the 
collected edition to multiple expres- 
sion-level records for the individual 
titles. The interface record plays an 
analogous role in ORR, recording in 
one place manifestation-level data that
apply to multiple serial works. 

Identifiers and FRBRization 

The ORR project combines aspects of 
two kinds of undertakings. It resem- 
bles certain FRBR implementations 
in that it populates its database by 
taking existing data and reconstructing 
them, via a predetermined algorithm, 
into a unified hierarchical structure. 
This is a process sometimes known as 
FRBRization. ORR also resembles a 
link resolver in that it is built around 
a massive consolidation of subscrip- 
tion- and holdings-related data, lev- 
eraged to facilitate effective linking 
and discovery among different con- 
tent providers. In some ways, the 
ORR project resembles a specialized 
serial counterpart of OCLC projects 
like OpenWorldCat, which combine 
the two foregoing strategies by first
FRBRizing an existing data set and
then using a database of holdings to 
identify available copies. 

In ORR, FRBRizing the ingested 
data organizes the links. Content from 
various providers is brought together 
under a single title. Links are supplied 
from the electronic versions to the
print versions, and also between earlier 
and later titles. In one respect, the task 
of bringing together the relevant data 
is easier than with large-scale efforts, 
such as those undertaken by OCLC to 
FRBRize monograph records. Those 
projects rely on complex work keys 
such as author/title and author/uniform 
title combinations to create clusters of 
works, expressions, and manifestations 
with varying degrees of success. 18 By 
contrast, much of the desired cluster- 
ing of data in ORR can be achieved 
through the simpler process of match- 
ing ISSNs from different sources and 
mapping selected data elements into 
the relevant fields in ORR records. 
ISSN is widely used and is assigned 
at the right level of granularity to 
serve adequately as a work identifier 
in most situations, at least relative to 
a given language and physical format. 
The availability of ISSN as a work key 
is fortunate because the uniform title 
headings (MARC field 130) charac- 
teristic of serial records are designed 
to distinguish titles rather than to col- 
locate them, and are consequently of 
limited value as work keys. 19 
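
The clustering step can be sketched as a simple grouping of source records on a normalized ISSN. The sketch is illustrative only: the record layout, source labels, and ISSN values are invented, and the normalization checks only the shape of the number, not its check digit.

```python
import re
from collections import defaultdict
from typing import Optional

def normalize_issn(raw: Optional[str]) -> Optional[str]:
    """Return an ISSN as NNNN-NNNN, or None if the value is missing or malformed."""
    if not raw:
        return None
    compact = re.sub(r"[\s-]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        return None
    return f"{compact[:4]}-{compact[4:]}"

def cluster_by_issn(records: list[dict]) -> tuple[dict[str, list[dict]], list[dict]]:
    """Group source records sharing an ISSN; set aside those without a usable one."""
    clusters: dict[str, list[dict]] = defaultdict(list)
    unmatched: list[dict] = []
    for rec in records:
        issn = normalize_issn(rec.get("issn"))
        if issn:
            clusters[issn].append(rec)
        else:
            unmatched.append(rec)
    return dict(clusters), unmatched

# Invented sample records from three hypothetical feeds.
incoming = [
    {"source": "tdnet", "issn": "1234-5678", "url": "https://example.org/j"},
    {"source": "ils", "issn": "12345678", "title": "Example Journal"},
    {"source": "ebsco", "issn": None, "title": "Another Journal"},
]
clusters, unmatched = cluster_by_issn(incoming)
```

Records that fall into the unmatched list are the ones that would need matching on other evidence or manual review, which is consistent with the paper's remark that automated processes alone cannot ensure completeness.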

ISSN in its present form is far from 
ideal as a work identifier. Like other 
extant identifiers such as International 
Standard Book Numbers (ISBN) and 
Digital Object Identifiers (DOI), it 
addresses the need of publishers to 
identify distinct entities, but not the 
need of users to navigate between 
related ones. Knowing the ISSN of 
the print version does not help one 
find the online version, and vice versa,
unless one has access to some kind of 
dictionary. This characteristic of identi- 
fiers is a consequence of the Principle 
of Functional Granularity, promulgat- 
ed by the Indecs e-commerce body, 
which states that "it should be possible
to identify an entity whenever it needs
to be distinguished." 20 The principle 
says nothing about identifying related 
entities. At the time of this writing, a 
proposal was before the International 
Organization for Standardization (ISO) 
to introduce a mandatory Medium- 
Neutral ISSN (MNI) into the ISSN 
standard. This measure, if adopted, 
may ameliorate the existing difficulties 
considerably. Even in its present form, 
ISSN enjoys the important advantage 
of being uniform across different pro- 
viders and through changes of pub- 
lisher, something not true of ISBNs 
or DOIs. Each ORR record has two 
ISSN fields: one for the online ver- 
sion and one for the electronic ver- 
sion. These two fields jointly suffice to 
identify the title for most purposes. To 
establish the correspondence between 
print and online ISSNs, having sources 
of data that can associate the two, such 
as MARC records with ISSNs in their 
022 and 776 fields, is valuable. 
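
The "dictionary" needed to move between print and online ISSNs can be sketched as a small lookup table. The sketch assumes the pairs have already been extracted from a source that associates the two (for example, a MARC record's 022 and 776 fields); the extraction itself is not shown, and the ISSN values are invented.

```python
from typing import Optional

IssnPair = tuple[Optional[str], Optional[str]]  # (print ISSN, online ISSN)

def build_issn_index(pairs: list[IssnPair]) -> dict[str, IssnPair]:
    """Map every known ISSN to the (print, online) pair it belongs to."""
    index: dict[str, IssnPair] = {}
    for pair in pairs:
        for issn in pair:
            if issn:
                index[issn] = pair
    return index

def same_title(index: dict[str, IssnPair], issn_a: str, issn_b: str) -> bool:
    """True when both ISSNs are known and belong to the same print/online pair."""
    return issn_a in index and index.get(issn_a) == index.get(issn_b)

# Invented pairs: one title in both formats, one online-only title.
index = build_issn_index([
    ("1111-1111", "2222-2222"),
    (None, "3333-3333"),
])
```

With such an index, a record arriving with only the print ISSN can still be attached to the entry keyed by the online ISSN, and vice versa.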

Serial Work Relationships 

Seriality encompasses a much greater 
range of relationships than those of 
the FRBR Group 1 hierarchy. Serials 
change attributes such as titles, ISSNs, 
publishers, and physical format over 
time. Some changes give rise to new 
works or expressions bearing specific 
relationships to their immediate sib- 
lings or ancestors. Serials also break 
down into various kinds and levels of 
constituent subunits, such as issues, 
volumes, articles, indexes, and supple- 
ments. Some of these relationships 
are outlined in the FRBR report, 
including "successor" and "supple- 
ment" relationships between works, 
and whole-part relationships between 
serial works and their constituents. 21 
These relationships define aggregates, 
and a comprehensive theory of serial
aggregates would do much to put the 
design of serials-management systems 
on a sounder footing. Aggregates are a 
relatively undeveloped area in FRBR, 
receiving barely a page of direct discussion
in the final report. Some progress
has been made since the report's
publication. A FRBR working group on 
aggregates now exists, and members of 
the FRBR community are develop- 
ing general taxonomies of aggregates 
and their properties. For example, 
Albertsen and van Nuys identify a set 
of aggregate classes among which are 
several that are applicable to continuing
resources: the "extension" class,
which subsumes successively issued
resources, including most conventional
serial publications; the "update"
class, which roughly corresponds to
the notion of an integrating resource;
and the "variant" class, which encompasses
alternative versions of a
publication. 22 In its present state, however,
FRBR offers only limited guidance to 
the developers of a tool like ORR. 

The question arises as to whether 
the aggregate is itself another work — a 
"super-work" — or indeed whether it is 
only the aggregate that may properly 
be identified as the work. Shadle advo- 
cates the latter position, arguing that 
the serial work should not necessarily 
be identified with any one record and, 
unless there is a merger or a split, the 
serial work should be considered to 
persist. 23 Although Shadle's position 
has strong intuitive appeal, Delsey's 
position, which allows the boundaries 
between works to be drawn by the pre- 
vailing cataloging code, can suffice. 24 
Until a theory of aggregates is more 
fully developed, the position taken on 
this issue is not critical. The structure 
of ORR is compatible with both posi- 
tions, and Delsey's approach has the 
advantage of simplifying the ontology.
From a practical viewpoint, what
matters is less which title-level
entities are called works and which
are called manifestations than
how well the relationships between
the entities are captured. For the
same reason, referring to aggregates 
as "super-works" is not crucial at this 
juncture, so long as works standing in 
specific relationships may form aggre- 
gates with definable properties. 



The most important relationship 
that ORR must deal with is the succes- 
sor relationship, or what serials librar- 
ians call title changes. This issue, and 
the related question of how ORR han- 
dles print holdings, highlights some 
unresolved issues with ORR's current 
data structure. 

ORR follows the AACR2 practice 
of successive entry cataloging. Each 
title change (or rather, each major title 
change) in a publication's history trig- 
gers the creation of a new record rep- 
resenting a related but distinct work. 
Each record contains links to the 
records for its predecessor and suc- 
cessor, but no structure represents the 
complete title history. In this respect, 
ORR exactly replicates the type of 
structure in AACR2 catalogs. It also 
inherits one of the weaknesses of such 
catalogs, namely the fact that one's 
ability to reconstruct a complete title 
history is contingent upon the library 
owning a sufficiently unbroken run of 
holdings for that publication. If a serial 
publication has the title history S1, S2,
S3, and the library owns issues of S1
and S3 but not S2, the bibliographic
data in its catalog will not allow
users to connect S1 with S3. This is
the "missing link" problem. Although 
some feel that a full title history is not 
always desirable, it is invaluable in a 
distributed environment where com- 
plementary coverage may be available 
from different sources. 25
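
The missing link problem can be demonstrated with a small sketch of successive-entry records, each carrying only links to its immediate predecessor and successor. The record layout and key names are assumptions made for illustration.

```python
def reachable_titles(catalog: dict[str, dict], start: str) -> set[str]:
    """Follow earlier/later links from start, visiting only records in the catalog."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        if current in seen or current not in catalog:
            continue  # a linked title with no record of its own ends the chain
        seen.add(current)
        stack.extend(catalog[current].get("earlier", []))
        stack.extend(catalog[current].get("later", []))
    return seen

# Complete title history: S1 continued by S2, S2 continued by S3.
full_history = {
    "S1": {"later": ["S2"]},
    "S2": {"earlier": ["S1"], "later": ["S3"]},
    "S3": {"earlier": ["S2"]},
}
# A library holding S1 and S3 but not S2 has no record for the middle title.
partial_holdings = {k: v for k, v in full_history.items() if k != "S2"}
```

With the full history, every title is reachable from any other. With the partial holdings, a traversal starting at the first title stops there, because the only path to the third title runs through a record the catalog does not contain.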

This structural shortcoming over- 
laps with the problem of representing 
different formats. The FRBR report 
suggests that alternative formats are 
to be represented at the manifestation 
level. 26 That approach is not taken 
in ORR. The visual cues in the pub- 
lic display present any print holdings 
that are available, not as one version 
among others, but rather as a link to 
the online catalog. The display reflects 
the database schema, which locates 
print holdings data (as well as the ILS 
record identifier used to generate the 
link to the online catalog) not in the 
instance but in the resource record. 



The differing treatment given to 
electronic and print formats can partly 
be explained by ORR's design objec- 
tives. The primary purpose of ORR 
is to represent available online con- 
tent. For most users, the catalog link 
exists to provide a fallback should 
the desired full-text content not be 
available online — for example, if the 
issue sought predates the available 
back files. Accordingly, the instance 
record is optimized for online content. 
The library catalog remains the main 
source of information about print 
holdings and, rather than attempt to 
replicate its content in detail, ORR 
simply links to it. 

This approach, however, equiv- 
ocates between works and larger 
aggregates. The equivocation is often 
evident in the holdings data displayed 
in conjunction with the links to print 
and microform records in the online 
catalog. In some cases, holdings data 
are displayed for the specific jour- 
nal title; in others, holdings data are 
displayed for the entire title history, 
including titles predating any available 
online content. In other words, the 
holdings data displayed in some cases 
represent another manifestation of the 
same work and in others represent a 
larger aggregate including that work 
and others. This inconsistency is partly 
the result of historical UIUC serials 
cataloging practice, which for a time 
followed latest entry, but it also reflects 
an unresolved tension in ORR's treat- 
ment of serial aggregates. 

Locating the link at the work 
level does not solve the missing link 
problem. The problem arises in a 
particularly acute form in this setting. 
Returning to the example of a title history 
S1, S2, S3, consider a case where 
S1 and S2 are issued in print, but 
the journal moves to an online-only 
format with S3. The ORR entry will 
naturally be for S3. In such a case, 
no print equivalent exists to which 
the ORR record can link. The same 
problem arises where a print version 
continues to be issued but the library 
cancels its print subscription in favor 
of online access before a change of 
title. For example, the UIUC library 
cancelled its print subscription to 
Archives of Otolaryngology in 1975, 
but later regained access to this journal 
via an online subscription with cover- 
age beginning in 1995. In the mean- 
time, the journal had changed its title 
to Archives of Otolaryngology — Head 
and Neck Surgery, with a new ISSN. 
Again, the link between electronic and 
print holdings is lost. The problem 
compounds over time as each succes- 
sive title change puts further distance 
between the latest online incarnation 
and its print predecessors. Until now, 
this problem has arisen only with a 
small number of titles, but ORR's 
developers will need to address it as 
more journals move toward online- 
only access. 

The UIUC catalog uses a single 
record to represent print and online 
versions of a title. The problem would 
take a somewhat different form in 
catalogs that use multiple records. At 
UIUC, using the same record iden- 
tifier to reference the bibliographic 
description for the work and to link 
to the print holdings is possible. Had 
UIUC used multiple records, the 
issues raised by aggregates would have 
been confronted at a much earlier 
stage of ORR's development. A library 
using multiple records would need to 
define explicitly an alternate relation- 
ship (defined in section 5.3.4 of the 
FRBR report) between the manifesta- 
tions represented by the two records 
and, presumably, to enter both record 
identifiers in the database. 



From Relationships to Families 

Rules for title changes in the past may 
have been too strict. The cataloging 
rules may not adequately capture the 
notion that a work may persist through 
changes, even major changes, in title. 
The long-running debate in the serials 
cataloging community between 
successive and latest entry cataloging 
may reflect conflicting views about 
the identity of a serial work over time. 
Shadle's position on serial aggregates 
similarly gives expression to the desire 
to capture the nature of the serial work 
as a persisting entity. 27 

Why is there no record structure 
in ORR representing the aggregate's 
title history? A convenient source has 
yet to be found for the required data. 
To remain complete and current, ORR 
depends on external data sources and 
could not otherwise exist on its pres- 
ent scale. The same dependence also 
means that ORR is constrained by the 
quality of the available data, and by 
the data structure of the source. As 
with identifiers, ORR to some degree 
inherits the characteristics and under- 
lying assumptions of existing standards. 
Had latest-entry rather than succes- 
sive-entry cataloging been the norm, 
extracting the complete title history 
from the MARC record in hand would 
have been relatively easy. 

A number of proposals coalesce to 
suggest a way forward. A 1993 study 
by Alan showed that more than 70 
percent of "title-change record sets" 
within a sample of CONSER-authenticated 
MARC records were linked 
together by a combination of ISSNs, 
LC control numbers (LCCN), 
and OCLC numbers in the 780 and 
785 linking entry fields. 28 Antelman 
suggests that the same data could 
be used as the basis of a work-set 
algorithm that would create "biblio- 
graphic families" showing relation- 
ships between works. 29 
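
As an illustration of how linked identifiers could be grouped into families, the following is a minimal sketch using a union-find structure. It is not Antelman's algorithm. The function name `bibliographic_families` and the input format (pairs of identifiers assumed to have been harvested from 780 and 785 fields) are assumptions made for the example.

```python
from collections import defaultdict


def bibliographic_families(links):
    """Group identifiers so that any two joined by a chain of links share a family."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two families

    families = defaultdict(set)
    for x in list(parent):
        families[find(x)].add(x)
    return sorted(sorted(members) for members in families.values())


# Hypothetical identifier pairs, one per 780/785 link.
links = [("rec-1", "rec-2"), ("rec-2", "rec-3"), ("rec-8", "rec-9")]
print(bibliographic_families(links))
```

Because membership depends only on the existence of a chain of links, a family can be assembled even when no single record links to every other member.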

Tillett has advocated the use of 
authority records to show relation- 
ships among bibliographic entities, 
and Rosenberg and Hillman have 
proposed a structure for doing so 
with serial works. 30 Building author- 
ity structures based on data harvested 
using a strategy similar to Antelman's 
may be possible. Ideally, this author- 
ity file would be a large-scale shared 
enterprise, but even a local project 
within the limited context of ORR may 
be feasible. These authority records 
would record data — especially identifi- 
ers like ISSNs — relating to alternative 
formats, title changes, merges, splits, 
and other relationships. This approach 
would differ from the existing strategy 
used in ORR for linking title changes in 
that it would encompass a wider range 
of relationships and would allow all 
relationships to be shown to the user, 
overcoming the missing link prob- 
lems. The same data would have other 
potential applications. It could be used 
to effect linkages between catalogs in 
a shared environment, for example, or 
to enhance link resolution. 

The work-set algorithm suggested 
by Alan and Antelman could be sup- 
plemented by other sources of data, 
such as MARC 776 additional physical- 
form information and a subscription to 
the ISSN register. A proposed devel- 
opment by the ISSN International 
Centre promises an alternative model 
for implementing an authority struc- 
ture. 31 The plan is to implement the 
ISSN database as a lookup and reso- 
lution service. A service of this kind 
would make possible the building of 
extremely powerful and flexible tools 
for discovering and accessing seri- 
al publications, and would allow the 
developers of systems such as ORR to 
overcome many current obstacles. 

Given the pace of change in the 
current environment and the vagaries 
of journal publishing, a service resem- 
bling one of those outlined previously 
in this paper already may have been 
developed by the time this paper is 
published. 

Sources and Targets: 
Other Issues 

In ORR's distributed environment, 
many other issues arise with both the 
quality of the available data and with 
the characteristics of the resources 
to which ORR provides access. Data 
sources present particular problems. 

• The TDNet data feed does 
not have a separate field for 
tracking title changes, instead 
giving this information in free 
text within the title field. Title 
changes have to be caught by 
library staff members, who then 
create a new record manually. 

• Ulrich's Periodicals Directory, 
although it indicates if an elec- 
tronic version is available, does 
not always provide the corresponding 
electronic ISSN. This 
information has to be supplied 
from other data feeds, or else 
by a human operator. 

• The UIUC catalog uses the 
single record approach to rep- 
resent the print and electronic 
versions of each journal. This 
approach can result in the omis- 
sion of electronic ISSNs neces- 
sary for matching and linking. 
Best practice is to include 776 
fields providing the ISSN in 
subfield x and other identifi- 
ers (OCLC number, LCCN) in 
subfield w. 

• A single consolidated statement 
of print holdings is not usu- 
ally available. Instead, the cata- 
log breaks down the holdings 
for print copies by the vari- 
ous library locations. As already 
noted, the practice of successive 
entry is another obstacle to the 
provision of a single summary 
of holdings. ORR's print hold- 
ings field was initially populated 
partly with summary holdings 
data fortuitously available from 
another, unconnected project, 
but a different solution will 
need to be found for the longer 
term. In the future, data may 
be parsed from the 866 field of 
MARC-holdings records. 
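
The 776 practice mentioned in the list above can be sketched as follows. The field is modeled as a plain list of (subfield code, value) pairs rather than through any particular MARC library. The function name `linking_identifiers` and all identifier values are placeholders invented for illustration. The "(DLC)" and "(OCoLC)" prefixes follow the usual MARC convention for marking LCCNs and OCLC numbers in subfield w.

```python
def linking_identifiers(field_776):
    """Collect matching identifiers from a 776 field given as (code, value) pairs."""
    ids = {"issn": [], "control_numbers": []}
    for code, value in field_776:
        if code == "x":    # subfield x: ISSN of the other physical form
            ids["issn"].append(value.strip())
        elif code == "w":  # subfield w: record control numbers (LCCN, OCLC number)
            ids["control_numbers"].append(value.strip())
    return ids


# Placeholder values only; none of these identifies a real publication.
field_776 = [
    ("t", "Example journal (Online)"),
    ("x", "0000-0000"),
    ("w", "(DLC)00000000"),
    ("w", "(OCoLC)00000000"),
]
print(linking_identifiers(field_776))
```

With the electronic ISSN recorded in this way, a single-record catalog can still supply the identifier needed to match a data feed that keys on the online version.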

The targets to which ORR links 
can present a further layer of struc- 
tural complexity. Just as each source 
of data has its own structure that must 
be mapped into ORR, each provid- 
er's manifestation of a title has its 
own implicit structure for presenting 
the constituent units of each work or 
group of works. Most examples fall 
into one of the following categories: 

• The entire history of a journal is 
entered on a single page under 
its current title alone. Earlier 
titles are not given, unless they 
happen to be reproduced on 
the scanned pages of the earlier 
issues themselves. An exam- 
ple, cited by Jones, is Online 
Information Review, which 
does not appear anywhere on 
the Emerald site under its ear- 
lier title, Online and CD-ROM 
Review, even though some of the 
issues available on the site were 
originally published under that 
title. 32 Because individual titles 
are searchable within ORR, it 
provides better title-level access 
than the vendor's own site. This 
is a decided advantage, since 
journal articles are cited using 
the title of the journal at the 
time of publication. 

• All titles are accessed via a 
single page, with prominence 
given to the latest or current 
title. Individual titles are listed 
with their respective publica- 
tion dates, but it may not be 
possible to retrieve them by a 
search within the native inter- 
face. Examples of providers 
following this format include 
Springer and the Royal Society 
of Chemistry. Again, title-level 
access is better in ORR than 
through the vendor's site, but 
with the further advantage, at 
least in the examples given, that 
a link to a page representing 
each distinct title is possible. 

• Each title in the sequence 
is entered separately on its 
own page, with links provided 
between them. This is the 
most common arrangement, 
and most closely reflects suc- 
cessive entry practice. EBSCO, 
JSTOR, and many others follow 
this approach. In these cases, 
the ORR record for each title 
simply links to the correspond- 
ing page. 
• No title-level page is given and 
content is available only by 
searching for articles by means 
of a search form. An example 
is OCLC FirstSearch, for its 
Wilson Select Plus collection. 
In these cases, ORR shows the 
user an icon indicating that a 
further search will be required 
after linking to the vendor page. 
Whether the icon is displayed 
is determined by a field called 
"AutoLinkLevel" in the inter- 
face record. This field indicates 
whether the link points to a 
page for the title or whether a 
further search will be required. 

Conclusion 

This paper has presented a case study 
of the cataloging issues involved in 
the creation of an online journal finding 
list and serials management system. 
Although the analysis is a posteriori, 
FRBR concepts are strongly applicable 
to this project. The Group 1 hierar- 
chy is an obvious model for organiz- 
ing content from different providers, 
while the application of the larger 
FRBR framework to serial relation- 
ships raises important issues regard- 
ing aggregates. The discussion also 
touched briefly on Group 2 and Group 
3 entities — content providers and sub- 
jects respectively. 

In hindsight, conducting a FRBR 
analysis in the early stages of the ORR 
project would have been advisable. 
Such an analysis might have helped to 
clarify some of the issues that emerged 
during ORR's development, especially 
the treatment of title histories and 
print holdings. However, although the 
FRBR model provides a framework 
for conceptualizing the problems, it 
does not, at present, offer a complete 
blueprint for a solution. The challenge 
of applying FRBR to a serials system 
raises as many questions for the inter- 
pretation and future development of 
the FRBR model as it does for the 
design of the serials system itself. 

This study suggests a number of 
possible topics for further consider- 
ation. The FRBR approach of relat- 
ing user tasks to entity relationships 
may help to clarify what is needed 
to build interoperable services in a 
distributed environment. One poten- 
tial line of inquiry, hinted at but not 
pursued in any depth here, is how 
FRBR may help to model algorithms 
for link resolution. Much of the effort 
in this project went into mapping 
data from outside sources into ORR. 
FRBR analysis should help rational- 
ize the consolidation of data from 
various sources by ensuring that they 
map to entities at the right level of the 
FRBR hierarchy. More fundamentally, 
FRBR should be helpful in guiding the 
design of database structures for seri- 
als-management systems. 

The emphasis of this paper has 
been largely conceptual. The creation 
of ORR has been, above all, a practi- 
cal matter, and many aspects of its 
development are amenable to empiri- 
cal study. This author and his UIUC 
colleagues hope to publish a more 
detailed examination of the process 
of populating the database and its 
outcomes. 

Notes on Operations 

Linking Print and Electronic Books 

One Approach 

By Betsy Simpson, Jimmie Lundgren, and Tatiana Barr 

Library catalog searchers expect to retrieve information for all resources in the 
catalog that matches their search strategy. They expect keyword searching to 
retrieve a rich array of resources. In an effort to enhance service to users, the 
University of Florida Smathers Libraries acquired table of contents data to enrich 
bibliographic records for print books with publication dates from 1990 to the pres- 
ent. Many of these books have also been acquired in electronic format. Because the 
record for the same book in electronic format did not include the enhancements, 
catalog users were likely to retrieve the catalog record for the print version only 
and remain unaware of the availability of the electronic version. The authors, 
using insights from discussions surrounding the Functional Requirements for 
Bibliographic Records (FRBR) initiative, developed a method for serving users 
more effectively by linking these records to leverage the enhancements for both 
versions (two manifestations) of the same title. 1 

Introduction 

The proliferation of electronic 
resources and increasing expen- 
ditures on electronic resources have 
had a profound effect on library ser- 
vices. Librarians have been forced 
to rethink assumptions about basic 
library operations as well as long-held 
notions about user needs and behavior. 
Within the cataloging community, 
the Functional Requirements for 
Bibliographic Records (FRBR) has fos- 
tered a renewed commitment to creat- 
ing library catalogs that allow users to 
find, identify, select, and obtain library 
material, and to navigate through the 
catalog database more effectively. 
FRBR inspired the authors to look 
for ways to improve the link between 
library catalog records for corresponding 
print and electronic book (e-book) 
titles. The impetus for creating 
these links was the presence of table 
of contents (TOC) data only in the 
records for print materials. Searches 
by keyword in the online catalog that 
matched data in the TOC retrieved 
only the records for print books. As 
e-books become more accepted as 
alternatives and supplements to their 
print equivalents, users will benefit 
from efforts to enhance access to them 
through the library catalog. 

Background 

A major thrust of current national 
cataloging initiatives is toward improv- 
ing the display of connections and 
relationships among bibliographic 
entities. The FRBR conceptual model 
promotes a framework that high- 
lights the interrelatedness of works 
and allows users to navigate eas- 
ily among expressions, manifestations, 
and items. 2 Embracing the underly- 
ing tenets of FRBR, libraries have 
been motivated to explore changes 
that leverage bibliographic data in new 
ways. The Research Libraries Group's 
(RLG) RedLightGreen service was an 
early, large-scale, innovative applica- 
tion of FRBR principles. Launched 
in 2003 (and ended November 1, 
2006), RedLightGreen sought to mine 
RLG's union catalog for "conceptual 
relationships and holdings data." 3 
OCLC's Fiction Finder also employs 
automated means to find and collocate 
material in order to present users with 
a clear summary of various editions 
of fictional works and which libraries 
own them. 4 Advances in data harvest- 
ing as well as semantic interoperability 
hold promise for dramatically improv- 
ing user interfaces. 

Three obstacles stand in the way 
of libraries fully FRBRizing catalogs. 
First, typical library management sys- 
tems do not adequately manipulate 
the links that currently exist among 
bibliographic records. Referring to 
the inherent problems associated with 
record linking within the library envi- 
ronment, Gradmann speaks of the 
"library automation applications and 
the data architecture underlying these, 
which strangle libraries, creating a 
structural lack of technical flexibility" 
and discusses the need to "free librar- 
ian bibliographic data from its golden 
catalogue-cage." 5 Yee also decries the 
state of library catalog software, stat- 
ing that libraries are forced to "choose 
among undesirable alternatives" when 
seeking a system that has an adequate 
search engine with helpful displays. 6 
Strong words, but justified given 
that much data created by librarians 
sits essentially unused due to system 
limitations. 

Second, catalogs lack the data 
necessary to reflect relationships 
because catalogers frequently have 
not provided it, due in part to increasing 
pressure on catalog departments 
to economize operations and improve 
throughput. This has resulted in workflows 
in which downloaded copy is 
accepted as is, and original records are 
created with less detail. 

The third deterrent to more thor- 
ough recording of relationship data is 
that library management systems are 
not able to make use of all data. Yee 
notes that the trend toward deprofessionalization 
has reduced the number 
of highly trained, knowledgeable 
catalogers, and suggests that this void 
leaves the profession without a voice 
that understands the nuances of bibli- 
ographic description and can advocate 
successfully for change. 7 

Links between records may be 
missing entirely, not consistently 
applied, or entered in a way that is dif- 
ficult to extract. The labor that would 
be required to create links manu- 
ally on a record by record basis seems 
unthinkable in an age of shrinking 
budgets and staffs. Bowen discusses 
the possibilities for cataloger-created 
collocation, but acknowledges that the 
additional effort is likely prohibitive 
and will necessitate selective adoption. 8 
Catalogers, however, using the 
technological tools at hand and work- 
ing closely with systems professionals, 
public service librarians, and vendors, 
can develop more ways to provide and 
utilize relationship data, and play a key 
role in making library collections more 
accessible to users. 

What should catalogers do during 
this time of transition when devel- 
oping a FRBR-like catalog portends 
a significant outlay of time, money, 
and technical expertise? The authors 
suggest that catalogers can begin by 
taking whatever small steps are pos- 
sible while also collaborating with ven- 
dors and colleagues to institute more 
sweeping changes. Librarians can follow 
Bowen's recommendation to "look 
for opportunities to implement some 
aspects of the FRBR model within 
other activities that are more under 
the library's immediate control." 9 In 
this spirit, the authors approached a 
local experiment to link e-book records 
to their print equivalents. A literature 
search did not produce evidence of 
similar projects elsewhere. 

NetLibrary, a division of OCLC, 
is a leading provider of e-content. 
It is one of a growing number of 
companies offering access to e-books 
and, often, the corresponding files 
of MARC bibliographic records for 
downloading to local library cata- 
logs. 10 Since 2001, the University of 
Florida Smathers Libraries (UFL) has 
batchloaded approximately 250,000 e-book 
records from NetLibrary, Early 
English Books Online, Eighteenth 
Century Collections Online, History 
e-Books, Past Masters, Gale Virtual 
Reference Library, and Books24x7. 
With this level of activity, UFL wanted 
to maximize its investments by improv- 
ing the ease with which users can find 
these resources. Other libraries are 
likely experiencing similar needs. The 
library catalog can provide a solution 
by alerting users when the electronic 
version of a title is available along with 
the print version. 

UFL, like many other libraries, 
has loaded separate catalog records for 
e-books rather than attempting to uti- 
lize a single-record method of access. 
As a result, users often are presented 
with multiple entries for the same 
title, which they must examine individ- 
ually to discover the alternative format 
option. Since MARC e-book records 
usually replicate their corresponding 
print records in key access points, 
"browse" catalog searches retrieve 
both formats next to each other in the 
index. The TOC enhancements, which 
also provide access through browse 
searches to chapter titles for UFL's 
print books, do not retrieve the cor- 
responding e-books because they lack 
the same TOC data. If a user, through 
any search option, retrieves a record 
for the print version, the burden is on 
him or her to go back to the catalog 
index to note the existence of a record 
for the e-book, and vice versa. Often 
that index (because of the search 
strategy used) does not include the 
other record at all and calls for a new 
search (not based on TOC enhance- 
ment data) to determine whether or 
not another version is available. Figure 
1 illustrates the problem. UFL owns 
both the print and e-book versions 
of Social Cognition: Making Sense of 
People. When the term "hot cognition" 
is searched, only the print version 
record is retrieved because that term 
is present in the TOC. 

To better serve UFL users, the 
authors sought a means to improve 
searching. Although the authors' project 
led to the use of a system-specific 
field not widely available at other insti- 
tutions, this paper seeks to highlight 
a creative and collaborative approach 
to a linking problem that resulted in a 
solution — an approach that might be 
adopted by others coping with imper- 
fect user interfaces. 



Linking Project 

Because the authors recognized that 
UFL holds many titles in both print 
and electronic forms and, through 
FRBR, had a heightened awareness 
of the value of relationships in the 
catalog, they wanted to find a good 
way to connect users to the two mani- 
festations represented by the print 
and e-book versions of the same title. 
In describing future practice in "Draft 
Interim Guidelines for Cataloging 
Electronic Resources," the Library 
of Congress (LC) made a distinc- 
tion between the collocating function 
and the linking function. 11 LC advised 
using the added entry technique for 
collocating and the linking function 
when appropriate. The 776 linking 
field (additional physical form entry) 
was designated to represent horizontal 
relationships. In 1999 Florida's State 
University Libraries, including UFL, 
required the use of the 776 field in 
its "Access and Cataloging Guidelines 
for Electronic Resources" for its coop- 
erative digitization program. 12 This 
proved advantageous when LC wanted 
to know which of UFL's holdings 
had been digitized. These were identi- 
fiable through the 776 field in catalog 
records. Critical to the success of the 
project described here is the fact that 
NetLibrary distributes catalog records 
for e-books that include 776 link- 
ing fields with both the Library of 
Congress Control Number (LCCN) 
and the OCLC number for the print 
version. Unfortunately, not all e-book 
vendors do. 

Figure 1. Search results for "hot cognition" 

Initially, the authors explored 
making the 776 field a functional linking 
field in the catalog, as are the 780 
(preceding entry) and 785 (succeeding 
entry) fields in the UFL catalog. For 
those fields, a hidden search is triggered 
by clicking on the linking field, 
which leads the user to a results list 
that includes the related serial record. 
Such indirect linking or "pseudo- 
hyperlinking" is described in detail in a 
2005 report issued by the Task Group 
on Linking Entries of the Program 
for Cooperative Cataloging (PCC) 
Standing Committee on Automation. 13 
This option, while definitely valuable, 
is less than ideal because it might 
lead to no results or force the user to 
select the related record from an often 
ambiguous display list. Many libraries 
use this kind of solution for connecting 
serial records based on either title or 
ISSN (International Standard Serial 
Number) searches, although not as 
many as the authors expected. 

The authors conducted a small, 
informal analysis in spring 2006 to 
determine how large academic librar- 
ies make use of the earlier and later 
serials titles linking fields. The authors 
searched ten serial titles in the cata- 
logs of twenty-one libraries randomly 
selected from those represented in the 
Association for Library Collections & 
Technical Services' Technical Services 
Directors of Large Research 
Libraries discussion group. They 
observed the options provided 
to facilitate users' ability to con- 
nect between the earlier title 
(780 field) and later title (785 
field) records. Results fell into 
four categories (see table 1). In 
some cases, the presence of mul- 
tiple 78x fields apparently pre- 
vented the clicking function of 
both fields. This exploratory sur- 
vey, while limited in scope, did 
uncover typical patterns of ser- 
vice for this function in at least 
five integrated library manage- 
ment systems. More extensive 
research along these lines could 
be both interesting and useful. 
Recording earlier and later titles 
of bibliographic records for serials 
appears to be the most consistent 
practice; other methods 
for connecting users to related 
titles appear lacking. This falls short of 
the ideal of providing complete information 
for catalog searchers as they 
seek to identify and locate relevant 
resources. 

Table 1. Use of linking fields in online catalogs (N=21) 

| Presence of link | Description of functionality | Libraries | % of total |
|---|---|---|---|
| No clickable link | Display of journal record does not include clickable link to earlier or later title. | Duke University, Harvard University, Indiana University, Princeton University, University of Chicago, University of Minnesota, University of Pennsylvania, University of Texas at Austin | 38 |
| Clickable link to search screen | Display of journal record includes clickable link for earlier or later title, and clicking leads to search screen with browse and keyword search options for related title. Clicking there leads to results list for chosen search. | University of California at Berkeley, University of Michigan | 10 |
| Clickable link to results list only | Display of journal record includes clickable link for earlier or later title, and clicking leads to results list for journal title search for related title. | Cornell University, New York University, Pennsylvania State University, Stanford University, University of California at Los Angeles, University of Illinois at Urbana-Champaign, University of Virginia, University of Wisconsin at Madison, Yale University | 42 |
| Clickable link to related record or results list | Display of journal record includes a clickable link that leads either directly to the related record or to a results list when the entry is not unique. | Ohio State University, University of Washington | 10 |

The Florida Center for Library 
Automation (FCLA), which provides 
automation services to the libraries 
of Florida's publicly funded universi- 
ties, was unable to identify any type 
of similar capability for 776 fields 
in UFL's ALEPH integrated library 
management system (implemented in 
May 2004). ALEPH, however, offers a 
non-MARC field that can be used to 
connect records directly. This ALEPH 
system-specific field allows direct 
functional connections among biblio- 
graphic records, holdings records, and 
item records. It is useful for connect- 
ing bound-together titles, analytics for 
collection or set level records, and 
(as the authors discovered) has other 
interesting possibilities. The PCC Task 
Group on Linking Entries discussed 
the benefits of creating such logi- 
cal links, even suggesting a possible 
7XX subfield utilizing local system 
or standardized numbers, although 
the group voiced concern about the 
limited availability of data in catalogs 
to support linking and the lack of 
cataloging staff to enter it. 14 The group 
focused primarily on complex linking 
among serial records rather than on 
the straightforward one-to-one rela- 
tionship between equivalent print and 
e-book records. 

Use of a local non-MARC field 
could be called "guerilla cataloging" 
for several reasons. While one can 
enter and save the field in the local 
system, this cannot be done when 
cataloging in OCLC. It may or not be 
retained upon migration to another 
system. However, the authors hope 
that the greater awareness of the value 
of bibliographic relationships, which 
has been highlighted by FRBR discus- 
sions, will result in improved online 
catalog systems that will continue to 
use this data. 

Because this field is system-spe- 
cific to ALEPH, it is not addressed in 
AACR2, MARC 21, or OCLC cataloging 
and coding rules. This leaves 
local catalogers without guidance as 
to how and when to utilize new tools 
such as this, and (at UFL) is lead- 
ing to open-ended discussions among 
public and technical services staff. 
The UFL Cataloging and Metadata 
Department charged a committee to 
evaluate local use of this field. The 
Task Force focused chiefly on serials 
and special collections materials, but 
staff are encouraged to explore other 
possibilities. When used for parallel 
bibliographic records, this field gener- 
ates a reciprocal note about the other 
form available in the public displays of 
both matched records. The note can be
clicked to connect a user directly to the
matched record. No intervening index
displays, as it does with 780 and
785 field linking.

Worthy of note is the way in 
which UFL leveraged investment in 
the TOC records. The TOC enhance- 
ments were acquired, in effect, as 
a "two for the price of one" bargain 
because catalog users retrieve both 
records (for the print and electronic 
versions) in searches even though the 
TOC are loaded only in the records for
the print version. 



Implementation 

The project was implemented without
a costly investment of staff time, either
to make the changes or to sustain
the new steps for linking records.
UFL's technical coordinator designed a 
highly automated, multi-step process. 
The first step was to identify the pairs 
of records for alternate versions. This 
was done by generating system reports 
of the NetLibrary records with the 
LCCN in 776 subfield w (record control
number). System-specific loader
software (GenLoad), available from
FCLA, was run to find any matching
print version records using an LCCN 
search. The system numbers were 
extracted and merged into an Excel 
spreadsheet. A macro was then used 
to insert the local field in one record 
of each pair. While either record could 
contain the local field, the authors 
decided to place it in the NetLibrary 
record because the library was still 
in the process of adding the tables of 
contents to the print version records 
and did not want to risk overlaying the 
linking data. The flowchart in figure 2 
shows the steps involved. 
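
As a rough illustration of the matching step described above, the following Python sketch pairs e-book and print records on a normalized LCCN and writes the resulting pairs out as spreadsheet rows. It is not the GenLoad process or the macro used at UFL; the function names, the sample system numbers, and the column headings are hypothetical, and the LCCN normalization assumes the general Library of Congress convention of removing blanks and zero-padding the serial portion that follows a hyphen.

```python
import csv
import io


def normalize_lccn(raw):
    """Reduce an LCCN to a comparable key (assumed convention, see lead-in)."""
    value = raw.replace(" ", "").lower()
    # Drop any revision suffix that follows a forward slash.
    value = value.split("/", 1)[0]
    if "-" in value:
        prefix, serial = value.split("-", 1)
        # Pad the serial portion to six digits: "2005-1234" -> "2005001234".
        value = prefix + serial.zfill(6)
    return value


def pair_by_lccn(ebook_rows, print_rows):
    """Match (system_number, lccn) rows from two hypothetical system reports.

    Returns a list of (ebook_sysno, print_sysno) pairs and a list of e-book
    system numbers that had no match or more than one candidate match.
    """
    print_index = {}
    for sysno, lccn in print_rows:
        print_index.setdefault(normalize_lccn(lccn), []).append(sysno)

    pairs, unmatched = [], []
    for sysno, lccn in ebook_rows:
        hits = print_index.get(normalize_lccn(lccn), [])
        if len(hits) == 1:
            pairs.append((sysno, hits[0]))
        else:
            # Zero or several candidates: leave for manual review.
            unmatched.append(sysno)
    return pairs, unmatched


def pairs_to_csv(pairs):
    """Serialize the pairs as CSV text, standing in for the spreadsheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ebook_sysno", "print_sysno"])
    writer.writerows(pairs)
    return buffer.getvalue()


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    ebooks = [("E1", "2005-1234"), ("E2", "2001000001")]
    prints = [("P1", "2005001234")]
    matched, review = pair_by_lccn(ebooks, prints)
    print(pairs_to_csv(matched))
    print("Needs review:", review)
```

In this sketch, e-book records with no print match, or with more than one candidate, are set aside rather than linked automatically; that is one cautious design choice among several a library might make.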

After the field is inserted in one of 
a pair of records, a user who retrieves 
one of the records is both informed 
about the related item and supported 
in easily connecting to the other record 
by a simple click. Thus, a remote
user who retrieves the record for the
print version can connect directly to
the electronic version, which can be
viewed without leaving the computer.
Similarly, a user who
first retrieves the e-book record but 
desires to borrow and use a print book 
will be connected to the record giving 
shelf location and availability. Figures 
3 and 4 show the OPAC views for the 
linked records. 

This process, in addition to the 
advantage of using little staff time,
can be applied 
retrospectively to 
records that may 
have been in the 
catalog for years. 
While UFL has 
already begun con- 
necting new bound- 
together items and 
analyzed sets of 
various kinds, there 
is also interest in 
identifying other 
methods for better 
presenting other 
related materials to 
users. The process 
of identifying the 
record pairs using 
the LCCN in the 
776 field may be 
replaced, at some 
point, by a more 
sophisticated process
that could use
author-title com- 
binations to relate 
many of the varied 
formats and edi- 
tions held by the 
library. Such meth- 
ods already are 
being used else- 
where by the larger-scale FRBR proj- 
ects and library catalog systems to 
process existing catalog data to enable 
new indexing and display options. The 
authors observed that Duke University
makes use of a feature in their catalog 
that creates an author-title entry for 
each record. When clicked, the author-title
entry opens a window populated
with matching records, although ver- 
sions are not differentiated. While not 
quite as intuitive as a clickable note on 
the record that says "Available in other 
form: E-book" and links directly to the 
corresponding record, the author-title 
link allows the user to navigate among 
different versions. The authors encourage
others to explore creative solutions that
will overcome the absence of data (for
example, uniform titles) that might have
facilitated navigation among different
versions, but which were not added to
records for cost reasons in the past.

Figure 2. Flowchart for creating link between print and e-book. Box labels recoverable from the figure: extract electronic system number and title; extract full electronic record; the 010 is the print LCCN, and this is the match point; run process to switch LCCN tag in electronic record from 010 $$z to 010 $$a; run process using GenLoad to match print and electronic version; electronic and print system numbers are migrated and merged into Excel spreadsheet; macro retrieves electronic record in catalog and inserts link to print version.
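
The author-title matching mentioned above could be approached, in a very simplified form, along the lines of the Python sketch below. It builds a crude match key from the author's surname and the title proper and groups records that share a key. This is only an assumption about how such a process might work, not a description of the Duke catalog feature or of any FRBR project; the function names and sample records are invented, and a key this coarse will also group different editions, much as the author-title window described above does not differentiate versions.

```python
import re
import unicodedata
from collections import defaultdict


def _squash(text):
    """Lowercase, strip diacritics and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def match_key(author, title):
    """Build a hypothetical author-title key: surname plus title proper."""
    surname = _squash(author.split(",", 1)[0])
    # Keep only the text before ISBD-style " : " or " / " punctuation.
    title_proper = _squash(re.split(r"\s[:/]\s", title, maxsplit=1)[0])
    return f"{surname}|{title_proper}"


def group_versions(records):
    """Group (sysno, author, title, fmt) records that share a match key.

    Only keys with more than one record are returned, since a single
    record has nothing to link to.
    """
    groups = defaultdict(list)
    for sysno, author, title, fmt in records:
        groups[match_key(author, title)].append((sysno, fmt))
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    # Invented sample records, for illustration only.
    sample = [
        ("000101", "Example, Ann", "Cataloging basics : a primer", "print"),
        ("000202", "Example, Ann, 1950-", "Cataloging Basics / Ann Example", "e-book"),
        ("000303", "Sample, Bo", "Serials", "print"),
    ]
    for key, members in group_versions(sample).items():
        print(key, members)
```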



Conclusion 

Linking between NetLibrary and 
print version records has provided 
an exciting new way to connect users 
to materials they need in the format 
they prefer. It extends the benefits of 
TOC enhancements that were only 
in the records for print versions to 
the records for matching electronic 
versions. It does so in a way that 
facilitates better user awareness of 
and connection between the two versions.
Additional access thus gained
includes browse searches for authors
and titles of chapters, and keyword
access to terms included in
the TOC fields of those records.

Figure 4. OPAC view of e-book version

This project enables users to 
access their preferred format 
for a given content regardless 
of the retrieval method and, in 
so doing, promotes what Tillett 
refers to as the fifth function of 
the catalog — to support naviga- 
tion. 15 Better navigation options 
are necessary if libraries are to 
satisfy users' growing needs and 
expectations. While the authors' 
preferred method for linking was, 
and continues to be, through the 
established MARC standard link- 
ing fields, in the absence of that 
possibility, an alternative process 
served the immediate need. 

This exploration of system- 
specific functionality had many 
other beneficial outcomes. The 
authors identified a function of 
the bibliographic record, thought 
in terms of the user's tasks, and 
then translated it into reality. This 
process led to a deeper under- 
standing of the uses of linking 
fields and how they can be 
incorporated into local workflow 
and thinking. The process itself 
expanded awareness and knowl- 
edge of the many issues that 
arise when librarians today try 
to offer users specific improve- 
ments to navigational and display 
capabilities in their catalogs. It 
was a learning experience that 
brought a refined understand- 
ing of FRBR terms. The cata- 
logers' collaboration with FCLA 
and UFL's technical coordinator 
that made this project possible 
benefited from using the FRBR 
model as a conceptual tool — one 
that enables various partners in 
the world of online information 
to communicate with each other 
effectively. The project made 
clear that today's catalogers must 
go beyond their traditional func- 
tions, explore new options in 
technology, and communicate their
ideas to those who can implement 
them and to those who benefit from 
the outcome. It highlighted the need 
for more consistent and coordinated 
practices by the cataloging commu- 
nity, system vendors, and suppliers of 
cataloging services. The next stage of 
the project will be to speak with public 
services staff at UFL about moving the 
linking note to a higher level of visibili- 
ty near the top of the record and in the 
brief view. Collaborating across library 
divisions and with systems designers to 
improve the navigability of the catalog 
for the users is in the best tradition of 
cataloging and represents the greatest 
hope for realizing the spirit of FRBR.

References 

1. Standing Committee of the IFLA Section on Cataloguing,
Functional Requirements for Bibliographic Records: Final
Report (Munich: K. G. Saur, 1998).

2. Barbara B. Tillett, "FRBR and Cataloging for the Future,"
Cataloging & Classification Quarterly 39, no. 3/4 (2005):
197-205.

3. "RedLightGreen Will Enhance Student Research," RLG News 57
(Fall 2003): 1.

4. OCLC, "Fiction Finder: A FRBR-based Prototype for Fiction in
WorldCat." www.oclc.org/research/projects/frbr/fictionfinder.htm
(accessed Oct. 21, 2005).

5. Stefan Gradmann, "rdfs:frbr-Towards an Implementation Model
for Library Catalogs Using Semantic Web Technology," Cataloging
& Classification Quarterly 39, no. 3/4 (2005): 64, 66.

6. Martha M. Yee, "FRBRization: A Method for Turning Online
Public Finding Lists into Online Public Catalogs," Information
Technology and Libraries 24, no. 2 (June 2005): 78.

7. Ibid.

8. Jennifer Bowen, "FRBR Coming Soon to Your Library?" Library
Resources & Technical Services 49, no. 3 (July 2005): 175-88.

9. Ibid., 181.

10. OCLC, "Membership Reports: Information Format Trends:
Content, Not Containers" (2004).
www.oclc.org/reports/2004format.htm (accessed Oct. 23, 2005).

11. Library of Congress, Cataloging and Support Office,
"Descriptive Cataloging Manual: Draft Interim Guidelines for
Cataloging Electronic Resources" (Dec. 18, 1997).
www.loc.gov/catdir/cpso/dcmb19.pdf (accessed Dec. 19, 2005).

12. Cataloging and Access Guidelines for Electronic Resources
Subcommittee, Technical Services Planning Committee, Council of
State University Libraries, "Access and Cataloging Guidelines
for Electronic Resources."
www.lib.usf.edu/techservices/CAGER/CAGERGuidelinesContents.html
(accessed May 5, 2006).

13. Program for Cooperative Cataloging, Standing Committee on
Automation, "Task Group on Linking Entries Final Report, Feb.
2005." www.loc.gov/catdir/pcc/archive/tglnkentr-rpt05.pdf
(accessed May 1, 2006).

14. Ibid.

15. Barbara Tillett, "FRBR and Cataloging Rules: Impact on
IFLA's Statement of Principles and AACR/RDA" (paper presented at
FRBR in 21st Century Catalogues: An Invitational Workshop,
Dublin, Ohio, May 2-4, 2005).
www.oclc.org/research/events/frbr-workshop/presentations/tillett/FRBR and cat rules.ppt
(accessed Dec. 21, 2005).



51(2) LRTS

Book Reviews 



Edward Swanson 

Essential Thesaurus Construction. By Vanda Broughton.
London: Facet Pub., 2006. 296p. $65 paper (ISBN 978-1-
85604-565-0/1-85604-565-X).

The Thesaurus: Review, Renaissance, and Revision.
Eds. Sandra K. Roe and Alan R. Thomas. Binghamton, N.Y.:
Haworth Infor. Pr., 2004. 209p. $39.95 cloth (ISBN 07890-
1978-7); $19.95 paper (ISBN 07890-1979-5). Published
simultaneously as Cataloging & Classification Quarterly,
37, nos. 3/4.

As stated in the subtitle of Sandra Roe and Alan
Thomas's collection of essays, the thesaurus may indeed 
be experiencing a renaissance in our digital era. Many 
information professionals continue to insist that access to 
electronic full-text is the only important precondition for 
resource discovery on the Web. At the same time, others, 
including professionals outside librarianship proper, have 
come to understand that the mediation provided by expertly 
developed and maintained controlled vocabularies is a basic 
requirement for service to populations of all sorts (including 
impatient undergraduates), and not an unwarranted inter- 
ference with the end-user's autonomy. The two volumes 
under review, together, address this topic in complementary 
fashion. 

Vanda Broughton, lecturer in library and information 
studies at the School of Library, Archive and Information 
Studies, University College London, has written a valuable 
manual for students and practitioners. Essential Thesaurus 
Construction is concerned with both the principles and prac- 
tice of thesaurus construction, "with rather more emphasis 
than is usual on the latter" (1). She has at the same time 
provided a practical and detailed introduction to taxonomy 
construction, outlining the basic methods for building a 
subject vocabulary. This is useful not only for the student, 
but also for the information professional who must create a 
thesaurus in-house. 

Broughton's book, while emphasizing practical appli- 
cation in real-world situations, does not slight theoretical 
issues. Instead, chapters with either theoretical or practi- 
cal emphases are integrated in a single logical sequence. 
The opening chapters discuss fundamentals, such as the 
nature of a thesaurus and how it is distinguished from other 
subject access tools, uses of thesauri and their advantages, 
types of thesauri, and the different displays typically pro- 
vided. Practical steps are then described, beginning with 
five chapters on aspects of vocabulary selection and simple 
vocabulary control, a process that results in the needed raw 
material from which a thesaurus is constructed. Chapters
on thesaural relationships, facets and arrays, hierarchies,
and the complex issues surrounding compound subjects and 
citation order, demonstrate the development of the thesau- 
rus from a mass of unstructured terminology to a logically 
developed system. The final chapters concern conversion of 
a classified arrangement to alphabetical format, creation of 
thesaurus records, and methods for maintenance and updat- 
ing. The entire process is illustrated at every stage through 
the actual development of a thesaurus on animal welfare, 
beginning with basic vocabulary sources such as scholarly 
journal articles and Web resources, through to the presenta- 
tion of fully structured entries. It becomes clear that, while
thesaurus and taxonomy construction may not be for
the timorous, there is a well-marked path to success for the 
determined and careful beginner. 

Clear expositions of often difficult concepts enhance 
the text. These include the thesaurus as indexing tool versus 
organizational or navigational tool (34) and the relation- 
ship of polyhierarchy to notation (179). The discussion 
of compound terms, and the circumstances under which 
they should be factored into simpler terms (beginning on 
page 90), prepares the reader for the complex question of 
whether to provide Broader/Narrower Term (BT/NT) relationships,
or Related Term (RT) relationships, to thesaurus
terms which are retained as compounds (180). 

Valuable pedagogical features complement Broughton's 
lucid prose. Glossary terms, when first appearing in the text, 
are in bold face. The glossary itself is written for non-spe- 
cialists, with an emphasis on "helpful explanations . . . rather 
than precise technical definitions" (208). Most chapters fea- 
ture several summaries, allowing for a quick review of new 
material. Exercises, with answers, are introduced beginning 
with chapter 11, concerning term extraction from docu- 
ment titles. Following the glossary, six appendices allow the 
motivated reader to examine the development of the sample 
thesaurus in detail. 

There were very few errors noted in this well-produced 
volume, which is convenient in format and easily lies flat. 
On page 219 there is the phrase, "one of the Ranganathan's
fundamental categories." The reference on page 248 to the 
animal product "fu" is baffling, until one realizes that "fur" 
is meant. On page 261, the class notation GAP seems out 
of order and probably should have been GP instead. These 
could be easily handled in an updated printing or revised 
edition. Finally, one of the incidental pleasures of this sort of 
book is how it expresses, even through the examples given, 
the vastness and complexity of the spheres of knowledge 
and human activity. (See, for example, the terms screaming
hairy armadillo, clog dancing, coastal erosion, and Biblical 
hermeneutics, found on pages 88-89.) 

Roe and Thomas's The Thesaurus: Review, Renaissance, 
and Revision, published two years prior to Broughton's 
work, nevertheless benefits by being read in conjunction 
with the latter. (Disclosure: the reviewer is presently a member
of the Cataloging & Classification Quarterly editorial
board, but was not at the time of this volume's publication.) 
The editors describe three motivations for producing this 
volume: "to acquaint or remind the Library and Information 
Science (LIS) community of the history of the development 
of the thesaurus and the standards for thesaurus construc- 
tion ... to provide bibliographies and tutorials from which 
any reader can become more grounded in her or his under- 
standing of thesaurus construction, use, and evaluation 
. . . [and] to address topics related to thesauri but that are 
unique to the current digital environment" (1-2). As with all 
such volumes of collected essays, it is almost inevitable that 
readers will find different contributions to be of greater or 
lesser value or interest, but all three of the editors' motiva- 
tions are successfully addressed to different extents. 

"The Thesaurus: A Historical Viewpoint, with a Look 
to the Future," by Jean Aitchison and Stella Dextre Clarke, 
appropriately opens the volume. Aitchison, who compiled the 
pioneer work Thesaurofacet in 1969, and is responsible for 
many faceted thesauri, is coauthor of Thesaurus Construction
and Use, now in its fourth edition and frequently cited by 
Broughton. 1 Following a historical review of the development
of the thesaurus as a type of tool, Aitchison and Dextre Clarke
point out the "two major trends in thesaurus development
today . . . for adaptations that will make a controlled vocabu- 
lary much quicker, easier, and more intuitive to use . . . [and 
for] interoperability" (14). The cry for more intuitive tools, 
however, runs into the contradiction between more loosely 
defined popular taxonomies on the one hand, and the need 
for "much more precisely defined term relationships" (16) 
for Semantic Web use on the other. 

The next three articles address the intention to pro- 
vide background readings and exercises for the novice. 
Alan R. Thomas's "Teach Yourself Thesaurus: Exercises, 
Readings, Resources" provides readings grouped in topical 
categories, to serve as a basis for "self-instruction in thesau- 
rus-making" (24). Specific exercises are not provided, but 
examples published elsewhere are pointed out. Marianne 
Lykke Nielsen's "Thesaurus Construction: Key Issues and 
Selected Readings" is a more conventional bibliographic 
essay, with readings organized generally around the stages 
of thesaurus construction. "A Practical Exercise in Building 
a Thesaurus," by James R. Shearer, aims to cover in twenty- 
two pages what Essential Thesaurus Construction covers 
in nearly three hundred. It is difficult to picture how this 
extremely condensed exposition could really be of use to 



the beginner. However, Shearer's essay, along with those by 
Thomas and Nielsen, could be of value as a supplement to 
Broughton's work. 

"Thesaurus Consultancy" by Leonard Will and 
"Thesaurus Evaluation" by Leslie Ann Owens and Pauline 
Atherton Cochrane describe professional services of value 
in the construction, testing, and revision stages of thesaurus 
development. The breadth of the latter essay is worthy of 
note. Owens and Cochrane discuss multiple approaches 
to evaluation — the "comparative, observational, formative, 
and structural methods" — and their applications in myriad 
contexts, "online or printed [thesauri], machine or human- 
generated, stand-alone or integrated, monolingual or bilin- 
gual, standards-compliant or not" (99). End-user research 
is represented in this volume by Jane Greenberg's "User 
Comprehension and Searching with Information Retrieval 
Thesauri." Business school graduate students searching 
ABI/Inform were studied regarding their past experience 
with use of thesauri in online searching, their desire to use 
thesauri once introduced to them, and their preferred "pro- 
cessing methods" (112) when working with thesauri. One 
outcome is the conclusion that users prefer "either interac- 
tive or a combination of automatic and interactive thesaurus 
processing to completely automatic processing" (116). 

Of the remaining contributions, Eric H. Johnson's
"Distributed Thesaurus Web Services" and Patrice Landry's 
"Multilingual Subject Access: The Linking Approach of 
MACS" are of great interest. Johnson's forward-looking 
essay tackles the challenge of making the online subject the- 
sauri much more useful for searching by the general Web- 
using public. He describes the Thesauro-Web, "a proposed 
network of thesaurus access and navigation services" (121), 
using a distinctly developed XML-based markup language 
and user interface. The idea that "you can search the Web 
more easily and effectively using specialized search applica- 
tions, only using the Web browser to fetch and display the 
actual Web documents" (127) is intriguing. Finally, Landry's 
description of the European MACS (Multilingual Access 
to Subjects) project, while not primarily concerned with 
thesaurus development per se, is valuable in that it raises 
the issue of interoperability, not only among languages, but 
more broadly among disciplines and controlled vocabularies 
of different levels of granularity. Taken together, Johnson 
and Landry remind us of the work that still needs to be done 
to provide access to the digital universe, by means that are as 
intelligent as they are intelligible. — David Miller, dmiller@curry.edu,
Curry College, Milton, Mass.

Cataloging Correctly for Kids. Eds. Sheila S. Intner,
Joanna F. Fountain, and Jane E. Gilchrist. Chicago: ALA, 
2006. 136p. $32 ($28.80 ALA members) paper (ISBN 0- 
8389-3559-1). 

The editors of this fourth edition gathered leaders 
from all types of libraries to discuss the hows and whys of 
cataloging and accessing children's materials in public and 
academic libraries. A nice blend of historical background is 
coupled with daily operating tactics resulting in a readable 
text; this was a nice surprise considering texts written on 
the subject of cataloging are oftentimes filled mostly with 
rules and regulations. As with the previous editions, the 
narrow focus on children is transcended in about half the 
chapters — much of the book's information applies to best 
cataloging practices in general and so might be useful to a 
variety of cataloging professionals. 

The first two chapters focus on the general thought pro- 
cesses for catalogers when producing records geared toward 
a young audience and what access points are needed for 
children to be successful when searching the online public 
access catalog (OPAC). Helpful reminders to the cataloger 
are sprinkled throughout, such as the routine inclusion of 
tag 586 to indicate a Caldecott, Newbery, or other award- 
winning title (9). Many cataloging records produced by 
the Library of Congress do not include this information, so 
these added data can be essential to locating all the previous 
winners with one OPAC search. 

Next, general cataloging issues are discussed in chapters 
4 through 6 on the topics of MARC, copy cataloging, and 
authority control. These subjects are often presented as 
scenarios in easy-to-understand summaries and examples of 
the rules and regulations. A prime example of the far-reach- 
ing arms of catalog records is illustrated when a hypothetical 
patron named John orders a book through interlibrary loan 
for his research paper; he receives a copy from your library; 
the information he needs resides in the introduction; your
copy lacks this section and John never uses the library again
(45). Chapter 3 on copy cataloging includes a brief history 
of the development of MARC as well as an authoritative 
yet lighthearted step-by-step procedure for cataloging that 
begins with, "Here's how it works: Picture me with a book 
or video, or something other fascinating library resource in 
one hand" (26). Examples such as these are what make this 
edition stand out among other books on cataloging. 

More focus on the children's and juvenile collections 
is found in chapters 7 through 9 that address Sears subject 
headings, Dewey call numbers, and cataloging of non-book 
materials. A brief historical background on Dewey and his 
classification scheme and the operating bodies involved with 
maintaining and overseeing changes are found in chapter 8. 

The title of chapter 10, "How the CIP Program Helps 
Children," gave me a chuckle. I am not sure it helps children 
in a direct way, as the title implies. However, I feel certain 
it does assist the cataloger of children's material. The major- 
ity of chapter 11 discusses a 2000 study on how academic 
libraries handle juvenile collections. Most classify juvenile 
fiction and nonfiction together, use labels to identify the col- 
lection, and use both Library of Congress Subject Headings 
and Library of Congress Annotated Card Program Subject 
Headings when assigning subject headings. 

Chapters on library automation and vendors that supply
cataloging records complete the main text. A good list
of questions to ask a proposed vendor as well as names 
and addresses are included at the end of chapter 12, and 
the text concludes with a glossary of acronyms, a bibliog- 
raphy, and an index. The heart of the book, from which all 
chapters seem to radiate, is to keep your audience in mind 
while following cataloging rules and regulations and if these 
rules allow, modify the cataloging record to give the young
searcher the best possible chance of locating library materials.
— Deana Groves, deana.groves@wku.edu, Western
Kentucky University, Bowling Green. 